IMPROVED RAPID SUBCLONING
USING SITE-SPECIFIC RECOMBINATION 

This is a Continuation-In-Part Application of pending Application Ser. No. 08/864,224, 
filed February 28, 1997. 

FIELD OF THE INVENTION

The invention relates to recombinant DNA technology. In particular, the 
invention relates to compositions, including vectors, and methods for the rapid 
subcloning of nucleic acid sequences in vivo and in vitro. 

BACKGROUND OF THE INVENTION 

Molecular biotechnology has revolutionized the production of protein and
polypeptide compounds of pharmacological importance. The advent of recombinant 
DNA technology permitted for the first time the production of proteins on a large scale 
in a recombinant host cell rather than by the laborious and expensive isolation of the 
protein from tissues which may only contain minute quantities of the desired protein
(e.g., isolation of human growth hormone from cadaver pituitary). The production of
proteins, including human proteins, on a large scale in a heterologous host requires the 
ability to express the protein of interest in the heterologous host. This process 
typically involves isolation or cloning of the gene encoding the protein of interest 
followed by transfer of the coding region into an expression vector that contains
elements (e.g., promoters) which direct the expression of the desired protein in the
heterologous host cell. The most commonly used means of transferring or subcloning 
a coding region into an expression vector involves the in vitro use of restriction 
endonucleases and DNA ligases. Restriction endonucleases are enzymes which 
generally recognize and cleave a specific DNA sequence in a double-stranded DNA
molecule. Restriction enzymes are used to excise the coding region from the cloning
vector and the excised DNA fragment is then joined using DNA ligase to a suitably 
cleaved expression vector in such a manner that a functional protein may be expressed.

The ability to transfer the desired coding region to an expression vector is often
limited by the availability or suitability of restriction enzyme recognition sites. Often 
multiple restriction enzymes must be employed for the removal of the desired coding 
region and the reaction conditions used for each enzyme may differ such that it is
necessary to perform the excision reactions in separate steps. In addition, it may be
necessary to remove a particular enzyme used in an initial restriction enzyme reaction 
prior to completing all restriction enzyme digestions; this requires a time-consuming 
purification of the subcloning intermediate. Ideal methods for the subcloning of DNA 
molecules would permit the rapid transfer of the target DNA molecule from one vector
to another in vitro or in vivo without the need to rely upon restriction enzyme
digestions. 

SUMMARY OF THE INVENTION 

The present invention provides reagents and methods which comprise a system 
for the rapid subcloning of nucleic acid sequences in vivo and in vitro without the need
to use restriction enzymes.

The present invention provides a method for the recombination of nucleic acid 
constructs, comprising: providing a first nucleic acid construct comprising, in operable 
order, an origin of replication, a first sequence-specific recombinase target site, and a 
nucleic acid of interest, a second nucleic acid construct comprising, in operable order,
an origin of replication, a regulatory element and a second sequence-specific
recombinase target site adjacent to and downstream from the regulatory element, and a 
site-specific recombinase; contacting the first and the second nucleic acid constructs 
with the site-specific recombinase under conditions such that the first and second 
nucleic acid constructs are recombined to form a third nucleic acid construct, wherein
the nucleic acid of interest is operably linked to the regulatory element. The present
invention contemplates the use of any type of regulatory element. In some 
embodiments of the present invention, the regulatory element comprises a promoter 
element, a fusion peptide (e.g., an affinity domain), or an epitope tag. In preferred 
embodiments, the nucleic acid of interest comprises a gene. 
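
By way of illustration only, the fusion of the first and second constructs described above can be sketched in Python by treating each circular construct as an ordered list of elements. The element names, the `rotate_to` and `fuse_at_site` helpers, and the two example constructs are hypothetical and are not taken from the specification; the sketch assumes each construct carries exactly one recombinase target site and models a single intermolecular crossover.

```python
from typing import List

SITE = "lox"  # stand-in label for any sequence-specific recombinase target site


def rotate_to(circle: List[str], element: str) -> List[str]:
    """Rewrite a circular construct so that `element` comes first."""
    i = circle.index(element)
    return circle[i:] + circle[:i]


def fuse_at_site(first: List[str], second: List[str], site: str = SITE) -> List[str]:
    """Model one crossover between single target sites on two circles.

    The first circle is opened at its site and inserted into the second
    circle at the second circle's site, giving one larger circle that
    carries two copies of the site.
    """
    if first.count(site) != 1 or second.count(site) != 1:
        raise ValueError("this sketch assumes exactly one target site per construct")
    opened_first = rotate_to(first, site)
    i = second.index(site)
    return second[:i] + opened_first + second[i:]


# Hypothetical constructs; elements are listed in order around each circle.
first_construct = ["conditional_ori", "lox", "gene_of_interest", "marker_1"]
second_construct = ["ori", "promoter", "lox", "marker_2"]

third_construct = fuse_at_site(first_construct, second_construct)
print(third_construct)
```

In the fused list the promoter is immediately followed by a target site and then the gene of interest, which corresponds to the arrangement the text describes as operably linked.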

In some embodiments, the first nucleic acid construct further comprises a
selectable marker. In other embodiments, the second nucleic acid construct further 
comprises a selectable marker. The present invention contemplates that the first and 
second nucleic acid constructs both comprise selectable markers. In preferred 
embodiments the selectable markers of the first and second nucleic acid constructs are
different from one another. Selectable markers include, but are not limited to a 
kanamycin resistance gene, an ampicillin resistance gene, a tetracycline resistance 
gene, a chloramphenicol resistance gene, a streptomycin resistance gene, a 
spectinomycin resistance gene, the aadA gene, the ΦX174 E gene, the strA gene, and
the sacB gene.

In preferred embodiments, the first nucleic acid construct further comprises a 
prokaryotic termination sequence. Prokaryotic termination sequences include, but are 
not limited to the T7 termination sequence. In other preferred embodiments, the first 
nucleic acid construct further comprises a eukaryotic polyadenylation sequence.
Polyadenylation sequences include, but are not limited to, the bovine growth hormone
polyadenylation sequence, the simian virus 40 polyadenylation sequence, and the 
Herpes Simplex virus thymidine kinase polyadenylation sequence. In yet other 
preferred embodiments, the first nucleic acid construct further comprises a conditional 
origin of replication. 

In preferred embodiments of the present invention, the first and second
sequence-specific recombinase target sites are selected from the group consisting of 
loxP, loxP2, loxP3, loxP23, loxP511, loxB, loxC2, loxL, loxR, loxΔ86, loxΔ117, frt,
dif, loxH and att. The present invention contemplates that the first and second 
sequence-specific recombinase target sites may comprise the same sequence or may
comprise different sequences.

In yet other embodiments of the present invention, the first nucleic acid 
construct further comprises a polylinker. 

The present invention contemplates that the recombination methods can be used 
in vitro and in vivo. In some in vivo embodiments, the site-specific recombinase is 
provided by a host cell expressing the site-specific recombinase. In some in vivo
methods, the contacting of the first and the second nucleic acid constructs with the
site-specific recombinase comprises introducing the first and the second nucleic acid
constructs into a host cell under conditions such that the third nucleic acid construct is
capable of replicating in the host cell.

The present invention further provides methods for precise transfer of nucleic 
acid molecules by recombination. In some embodiments, the first nucleic acid 
construct further comprises a third sequence-specific recombinase target site and the
second nucleic acid construct further comprises a fourth sequence-specific
recombinase target site. In preferred embodiments, the first sequence-specific
recombinase target site and the third sequence-specific recombinase target site in the
first nucleic acid construct are located on opposite sides of the nucleic acid of interest. It is
contemplated that the first and third sequence-specific recombinase target sites are 
contiguous with, adjacent to, or distant from the nucleic acid of interest. In
particularly preferred embodiments the third and fourth sequence-specific recombinase
target sites are selected from the group consisting of RS sites and Res sites, although 
other target sites are contemplated by the present invention. In some embodiments of 
this method of the present invention, the first nucleic acid construct further
comprises a third sequence-specific recombinase target site and the second nucleic acid
construct further comprises a fourth sequence-specific recombinase target site, wherein
the method further comprises providing a second site-specific recombinase and the step 
of contacting the third nucleic acid construct with the second site-specific recombinase 
under conditions such that the third nucleic acid construct is recombined to form a 
fourth and a fifth nucleic acid construct. 

The present invention also provides a recombined nucleic acid construct
prepared according to any of the above methods. 

The present invention further provides a method for the recombination of 
nucleic acid constructs, comprising: providing a vector, a linear nucleic acid molecule 
comprising a sequence complementary to at least a portion of said vector, and an E.
coli host cell, wherein said host cell comprises an endogenous recombination system, a
loss of function rec mutation, a suppressor, and a loss of function endogenous 
restriction modification system mutation; and introducing the vector and the linear 
nucleic acid molecule into the host cell under conditions such that the linear nucleic 
acid molecule and the vector are recombined to form a recombinant nucleic acid 
construct. In preferred embodiments the loss of function rec mutation is selected from
the group consisting of recBC and recD. In other preferred embodiments, the 
suppressor comprises sbc. In yet other preferred embodiments, the loss of function
endogenous restriction modification system mutation comprises hsdR. 

The present invention further provides a method for generating a nucleic acid
fusion on the 3' end of the nucleic acid of interest in the first nucleic acid construct
from above, comprising: providing a tagged linear nucleic acid sample comprising a 
tag to be added to the 3' end of the nucleic acid of interest, and a sequence 
complementary to a region of the first nucleic acid construct that is 3' of the nucleic 
acid of interest; and a host cell capable of endogenous homologous recombination of
complementary nucleic acid molecules; and introducing the tagged linear nucleic acid
sample and the first nucleic acid construct into the host cell under conditions such that 
the tagged linear nucleic acid sample and the first nucleic acid construct are 
recombined to form a tagged nucleic acid construct. 

The present invention further provides a method for the cloning of nucleic acid
libraries, comprising: providing a plurality of first nucleic acid constructs comprising,
in operable order, an origin of replication, a first sequence-specific recombinase target 
site, and a nucleic acid member from a nucleic acid library, a plurality of second 
nucleic acid constructs comprising, in operable order, an origin of replication, a
regulatory element and a second sequence-specific recombinase target site adjacent to
and downstream from the regulatory element, and a site-specific recombinase;
contacting the plurality of first and second nucleic acid constructs with the site-specific 
recombinase under conditions such that the plurality of first and second nucleic acid 
constructs are recombined to form a plurality of third nucleic acid constructs, wherein 
the nucleic acid members from the nucleic acid library are operably linked to the
regulatory elements. The present invention further provides a nucleic acid library 
prepared according to the above method. 

The present invention also provides a method for the directional cloning of a 
nucleic acid molecule, comprising: providing first and second portions of a regulatory 
element, a first nucleic acid molecule comprising the first portion of the regulatory
element; and a second nucleic acid molecule comprising the second portion of the 
regulatory element; and combining the first and the second nucleic acid molecules to 
produce a third nucleic acid molecule under conditions whereby an intact regulatory 
element is produced from the combination of the first and the second portions of the
regulatory element, wherein the presence of the intact regulatory element in the third
nucleic acid molecule indicates a direction of cloning of the first nucleic acid molecule 
with respect to the second nucleic acid molecule. 

The present invention also provides a method for the directional cloning of a 
nucleic acid molecule, comprising providing: the nucleic acid molecule to be cloned, a
first primer comprising sequence complementary to the nucleic acid molecule, a
second primer comprising sequence complementary to the nucleic acid molecule and 
sequence corresponding to a first portion of a lacO site, amplification means, and a 
target nucleic acid molecule comprising a second portion of the lacO site; amplifying 
the nucleic acid molecule with the first and second primers to produce a modified
nucleic acid molecule comprising the first portion of a lacO site; and ligating the
modified nucleic acid molecule into the target nucleic acid such that, when cloned in 
the desired direction, an intact lacO site is produced. In some embodiments, the 
method further comprises the step of detecting the intact lacO site. In particularly 
preferred embodiments, the target nucleic acid molecule comprises pUNI-30. 
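
The logic of this split-site approach can be sketched in Python under stated assumptions. The 12-nucleotide `SITE` below is a hypothetical, non-palindromic stand-in and is not the lacO sequence; the insert, flanking sequences, and the `ligate` and `has_intact_site` helpers are likewise invented for illustration. The sketch only shows why an intact site appears in one orientation of the insert and not the other.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


# Hypothetical indicator site (NOT the real lacO sequence), split into two portions.
SITE = "GATTACAGGCTA"
FIRST_PORTION, SECOND_PORTION = SITE[:6], SITE[6:]


def has_intact_site(molecule: str) -> bool:
    """True if the full site is present on either strand."""
    return SITE in molecule or reverse_complement(SITE) in molecule


def ligate(insert: str, left: str, right: str, forward: bool) -> str:
    """Join the insert between the target arms in the chosen orientation."""
    body = insert if forward else reverse_complement(insert)
    return left + body + right


# Amplified product: the second primer has appended the first portion of the site.
amplified = "ATGCCCGGGAAATTT" + FIRST_PORTION
# Target molecule: the second portion of the site sits next to the cloning junction.
target_left, target_right = "CCCC", SECOND_PORTION + "TTTT"

desired = ligate(amplified, target_left, target_right, forward=True)
reversed_clone = ligate(amplified, target_left, target_right, forward=False)

print(has_intact_site(desired))
print(has_intact_site(reversed_clone))
```

Only the clone in the desired direction brings the two portions together, so detecting the intact site reports the orientation of the insert.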

The present invention further provides a method for regulated recombination in
host cells that constitutively express a recombinase, comprising: providing a host cell 
expressing a recombinase, a first nucleic acid construct comprising an origin of 
replication, a first site-specific recombinase site, a second site-specific recombinase site 
that differs in sequence from the first site-specific recombinase site such that the
recombinase will not initiate recombination between the first and second site-specific 
recombinase sites, and a selectable marker gene between the first and second site- 
specific recombinase sites, and a second nucleic acid construct comprising an origin of 
replication, a third site-specific recombinase target site, and a fourth site-specific 
recombinase target site that differs in sequence from the third site-specific recombinase 
site such that the recombinase will not initiate recombination between the third and 
fourth site-specific recombinase sites; and introducing the first and second nucleic acid 
constructs into the host cell under conditions such that the first and second nucleic acid 
constructs are recombined. In some embodiments, the method further comprises the 
step of selecting for a desired recombinant nucleic acid molecule using the selectable 
marker. In preferred embodiments, the first nucleic acid construct is a Univector. In 
alternative preferred embodiments, the second nucleic acid construct is a Univector. 

The present invention also provides a nucleic acid construct comprising, in
operable order: a conditional origin of replication; a sequence-specific recombinase 
target site having a 5' and a 3' end; and a unique restriction enzyme site, said 
restriction enzyme site located adjacent to the 3' end of the sequence-specific 
recombinase target site. In some embodiments, the construct further comprises a 
prokaryotic termination sequence. In yet other embodiments, the construct further 
comprises a eukaryotic polyadenylation sequence. The present invention contemplates 
the use of any prokaryotic termination sequence and any eukaryotic polyadenylation 
sequence. In preferred embodiments, the construct further comprises one or more 
selectable marker genes. Selectable marker genes include, but are not limited to the 
kanamycin resistance gene, the ampicillin resistance gene, the tetracycline resistance 
gene, the chloramphenicol resistance gene, the streptomycin resistance gene, the strA 
gene, and the sacB gene. In preferred embodiments, the sequence-specific 
recombinase target site is selected from the group consisting of loxP, loxP2, loxP3,
loxP23, loxP511, loxB, loxC2, loxL, loxR, loxΔ86, loxΔ117, frt, dif, loxH and att.

In some embodiments the construct further comprises a gene of interest inserted 
into the unique restriction enzyme site. In particular embodiments, the construct has 
the nucleotide sequence set forth in SEQ ID NO:1 (Figure 26A). In other
embodiments, the construct further comprises a second sequence-specific recombinase 
target site. In preferred embodiments, the second sequence-specific recombinase target 
site is selected from the group consisting of an RS site and a Res site. In yet other
embodiments, the construct further comprises a polylinker. 

The present invention further provides a nucleic acid construct comprising in 5' 
to 3' operable order: an origin of replication; a promoter element having a 5' and a 3'
end; and a sequence-specific recombinase target site having a 5' and a 3' end. In 
some embodiments, the construct further comprises a selectable marker gene. 

The present invention also provides a nucleic acid construct comprising in 
operable order: a promoter element having a 5' and a 3' end; a first sequence-specific 
recombinase target site having a 5' and a 3' end, wherein the 3' end of the promoter 
element is located upstream of the 5' end of the sequence-specific recombinase target 
site; a gene of interest joined to the 3' end of the sequence-specific recombinase target 
site such that a functional translational reading frame is created; a conditional origin of 
replication; a first selectable marker gene; a second sequence-specific recombinase 
target site; and an origin of replication. In some embodiments, the construct further 
comprises a second selectable marker gene. 

The present invention also provides a method for the recombination of nucleic 
acid constructs, comprising: providing a first nucleic acid construct comprising a loxH 
site, a second nucleic acid construct comprising a loxH site; and a site-specific
recombinase; and contacting the first and the second nucleic acid constructs with the 
site-specific recombinase under conditions such that the first and second nucleic acid 
constructs are recombined. The present invention also provides a recombined nucleic 
acid construct prepared according to the above method. 

DESCRIPTION OF THE DRAWINGS

Figure 1 provides a schematic illustrating certain elements of the pUNI vectors 
and the Univector Fusion System.

Figure 2A provides a schematic map of the pUNI-10 vector; the locations of
selected restriction enzyme sites are indicated and unique sites are indicated by the use 
of bold type. 

Figure 2B shows the DNA sequence of the loxP site and the polylinkers
contained within pUNI-10 (i.e., nucleotides 401-530 of SEQ ID NO:1).

Figure 3A shows the oligonucleotides (SEQ ID NOS:4 and 5) which were 
annealed to insert a loxP site into the polylinker of pGEX-2TKcs to create pGst-lox.

Figure 3B provides a schematic map of pGEX-2TKcs which includes an 
enlargement of the multiple cloning site (MCS). 

Figure 4A shows the oligonucleotides (SEQ ID NOS:6 and 7) which were 
annealed to insert a loxP site into the polylinker of pVL1392 to create pVL1392-lox.

Figure 4B provides a schematic map of pVL1392 which includes an 
enlargement of the multiple cloning site (MCS); the ampicillin resistance gene (Ap R ) 
and the tac promoter (P tac ) are indicated.

Figure 5A shows the oligonucleotides (SEQ ID NOS:8 and 9) which were 
annealed to insert a loxP site into the polylinker of pGAP24 to create pGAP24-lox.

Figure 5B provides a schematic map of pGAP24 which includes an 
enlargement of the multiple cloning site (MCS); the ampicillin resistance gene (Ap R ), 
the GAP promoter (P GAP ), the origin from the 2μm circle (2μ) and the TRP1 gene,
encoding N-(5'-phosphoribosyl)-anthranilate synthetase, (TRP1) are indicated. 

Figure 6A shows the oligonucleotides (SEQ ID NOS:8 and 9) which were 
annealed to insert a loxP site into the polylinker of pGAL14 to create pGAL14-lox.

Figure 6B provides a schematic map of pGAL14 which includes an 
enlargement of the multiple cloning site (MCS); the ampicillin resistance gene (Ap R ), 
the GAL promoter (P GAL ), the yeast centromeric sequences (CEN), yeast autonomous 
replication sequences (ARS) and the TRP1 gene (TRP1) are indicated. 

Figure 7 shows a Coomassie blue-stained SDS-PAGE gel showing the 
purification of Gst-Cre from E. coli cells containing pQL123.

Figure 8 provides a schematic showing the strategy employed for the in vitro
recombination of a pUNI vector ("pA," pUNI-5) with a pHOST vector ("pB,"
pQL103) to create a fused construct ("pAB"). The relevant markers on each construct 
are indicated, as are selected restriction enzyme sites. 

Figure 9A provides a schematic showing the starting constructs (pUNI-Skp1
and pGst-lox) and the predicted fusion construct (pGst-Skp1) generated by an in vitro
fusion reaction.

Figure 9B provides an ethidium bromide-stained gel showing the separation of 
restriction fragments generated by the digestion of pUNI-Skp1, pGst-lox and
pGst-Skp1.

Figure 10A shows a Coomassie blue-stained SDS-PAGE gel showing the 
expression of the Gst-Skp1 protein from E. coli cells containing pGst-Skp1.

Figure 10B shows a Western blot of an SDS-PAGE gel containing extracts 
prepared from E. coli cells containing pGst-Skp1 which was probed using an anti-Skp1
antibody. 

Figure 11 shows a Western blot of an SDS-PAGE gel containing extracts
prepared from E. coli cells (QLB4) containing either a conventionally constructed
Gst-Skp1 plasmid or pGst-Skp1 (produced by an in vitro fusion reaction).

Figure 12 provides a schematic illustrating the in vivo gene trap method for the 
recombination of lox-containing vectors in a host cell constitutively expressing the Cre
protein. 

Figure 13 provides the nucleotide sequence of the wild-type loxP site (SEQ ID
NO:12), the loxP2 site (SEQ ID NO:13), the loxP3 site (SEQ ID NO:14) and the
loxP23 site (SEQ ID NO:15).

Figure 14 shows a schematic for one embodiment of Cre-mediated plasmid
fusion.

Figure 15 shows data demonstrating the efficiency of Gst-Cre recombinase 
activity as measured by UPS.

Figure 16 shows the protein expression of UPS generated fusion proteins
containing loxP following separation by SDS-PAGE and (A) staining with Coomassie
blue, and (B) immunoblotting with anti-Skp1 antibodies.

Figure 17 shows a comparison of expression levels between loxP and loxH
containing constructs. 

Figure 18 shows the expression of UPS-derived baculovirus expression
constructs in insect cells. 

Figure 19 shows immunoblotting with anti-HA antibodies of HeLa cells
expressing Myc-tagged F-box protein under the control of the CMV promoter. 

Figure 20 shows a schematic representation of the POT reaction. 

Figure 21 shows restriction digestion assays of sample that underwent POT 
with SKP1 replacing the E gene in pAS2-E.

Figure 22 shows a schematic of a method for directional subcloning of nucleic 
acid samples into a Univector. 

Figure 23 provides a schematic map of the pUNI-10, pUNI-20, and pUNI-30 
vectors. 

Figure 24 shows a schematic of a method for producing a tagged recombinant
protein.

Figure 25 shows a schematic of a gap repair scheme for modification of the 3'
end of coding regions using homologous recombination. 

Figure 26 shows the sequence for: A) SEQ ID NO:1; B) SEQ ID NO:10; and
C) SEQ ID NO:11.

DEFINITIONS 

To facilitate understanding of the invention, a number of terms are defined
below.

As used herein, "a conditional origin of replication" refers to an origin of 
replication that requires the presence of a functional trans-acting factor (e.g., a
replication factor) in a prokaryotic host cell. Conditional origins of replication include, 
but are not limited to, temperature-sensitive replicons such as rep pSC101ts.

As used herein, the term "origin of replication" refers to an origin of replication
that is functional in a broad range of prokaryotic host cells (i.e., a normal or
non-conditional origin of replication such as the ColE1 origin and its derivatives).

The terms "sequence-specific recombinase" and "site-specific recombinase" 
refer to enzymes that recognize and bind to a short nucleic acid site or sequence and
catalyze the recombination of nucleic acid in relation to these sites. 

The terms "sequence-specific recombinase target site" and "site-specific 
recombinase target site" refer to a short nucleic acid site or sequence which is 
recognized by a sequence- or site-specific recombinase and which becomes the
crossover region during the site-specific recombination event. Examples of
sequence-specific recombinase target sites include, but are not limited to, lox sites, frt sites, att
sites and dif sites. 

The term "lox site" as used herein refers to a nucleotide sequence at which the 
product of the cre gene of bacteriophage P1, Cre recombinase, can catalyze a
site-specific recombination. A variety of lox sites are known to the art including the
naturally occurring loxP (the sequence found in the P1 genome), loxB, loxL and loxR
(these are found in the E. coli chromosome) as well as a number of mutant or variant
lox sites such as loxP511, loxΔ86, loxΔ117, loxC2, loxP2, loxP3, loxP23, loxS, and
loxH.
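
As a hedged illustration of the architecture of a lox site, the following Python sketch checks the widely published 34 bp wild-type loxP sequence, which consists of two 13 bp inverted repeats flanking an 8 bp spacer. The sequence is supplied here from general knowledge and is not copied from the sequence listing of this document; the `split_lox` helper is a hypothetical name used only for this example.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


# Widely published wild-type loxP sequence: left arm, spacer, right arm.
LOXP = "ATAACTTCGTATA" "GCATACAT" "TATACGAAGTTAT"


def split_lox(site: str):
    """Split a 34 bp lox-type site into (13 bp arm, 8 bp spacer, 13 bp arm)."""
    if len(site) != 34:
        raise ValueError("expected a 34 bp site")
    return site[:13], site[13:21], site[21:]


left_arm, spacer, right_arm = split_lox(LOXP)

# The two arms are inverted repeats of one another.
arms_are_inverted_repeats = reverse_complement(left_arm) == right_arm
# The spacer is not its own reverse complement, so the site has an orientation.
spacer_is_asymmetric = reverse_complement(spacer) != spacer

print(arms_are_inverted_repeats, spacer_is_asymmetric)
```

Because the spacer is asymmetric, a lox site has a defined direction, which is why the relative orientation of two sites in a construct matters for the outcome of recombination.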

The term "frt site" as used herein refers to a nucleotide sequence at which the
product of the FLP gene of the yeast 2μm plasmid, FLP recombinase, can catalyze a
site-specific recombination. 

The term "unique restriction enzyme site" indicates that the recognition 
sequence for a given restriction enzyme appears once within a nucleic acid molecule.
For example, the EcoRI site is a unique restriction enzyme site within the plasmid
pUNI-10 (SEQ ID NO:1).
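
Whether a site is unique in this sense can be checked computationally. The sketch below is a minimal example, assuming a circular molecule and counting the recognition sequence on both strands; `toy_plasmid` is an invented 24-nucleotide sequence (not pUNI-10), and `count_sites` and `is_unique_site` are hypothetical helper names. The recognition sequences shown for EcoRI (GAATTC), BamHI (GGATCC), and HindIII (AAGCTT) are the standard ones.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def count_sites(sequence: str, recognition: str, circular: bool = True) -> int:
    """Count occurrences of a recognition sequence, including overlapping ones.

    Both strands are considered; for a palindromic site the two strands
    read the same, so each site is counted once.
    """
    seq, site = sequence.upper(), recognition.upper()
    if circular:
        # Extend past the end so sites spanning the junction are seen.
        seq += seq[: len(site) - 1]
    targets = {site, reverse_complement(site)}
    return sum(
        1 for i in range(len(seq) - len(site) + 1) if seq[i : i + len(site)] in targets
    )


def is_unique_site(sequence: str, recognition: str, circular: bool = True) -> bool:
    """True if the recognition sequence appears exactly once."""
    return count_sites(sequence, recognition, circular) == 1


# Invented toy plasmid: the EcoRI site spans the junction of the circle.
toy_plasmid = "TTCAAGCTTGGATCCAAGCTTGAA"

print(is_unique_site(toy_plasmid, "GAATTC"))  # EcoRI
print(is_unique_site(toy_plasmid, "GGATCC"))  # BamHI
print(count_sites(toy_plasmid, "AAGCTT"))     # HindIII
```

Treating the molecule as circular matters: in the toy sequence the EcoRI site is only found when the end of the string is joined back to its start.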

A restriction enzyme site is said to be located "adjacent to the 3' end of a 
sequence-specific recombinase target site" if the restriction enzyme recognition site is 
located downstream of the 3' end of the sequence-specific recombinase target site. 
The adjacent restriction enzyme site may, but need not, be contiguous with the last or
3' nucleotide comprising the sequence-specific recombinase target site. For example,
the EcoRI site of pUNI-10 is located adjacent (within 3 nucleotides) to the 3' end of
the loxP site (see Figure 2B); the XhoI, NdeI, and NcoI sites are also adjacent (i.e.,
within about 10-150 nucleotides) to the loxP site but these sites are not contiguous
with the 3' end of the loxP site in pUNI-10.

The terms "polylinker" or "multiple cloning site" refer to a cluster of restriction 
enzyme sites on a nucleic acid construct which are utilized for the insertion and/or 
excision of nucleic acid sequences such as the coding region of a gene, lox sites, etc. 

The term "prokaryotic termination sequence" refers to a nucleic acid sequence 
which is recognized by the RNA polymerase of a prokaryotic host cell and results in 
the termination of transcription. Prokaryotic termination sequences commonly 
comprise a GC-rich region that has a twofold symmetry followed by an AT-rich 
sequence [Stryer, supra]. A commonly used prokaryotic termination sequence is the 
T7 termination sequence. A variety of termination sequences are known to the art and 
may be employed in the nucleic acid constructs of the present invention including, but 
not limited to, the Tint, TL1, TL2, TL3, TR1, TR2, T6S termination signals derived from the
bacteriophage lambda [Lambda II, Hendrix et al., Eds., supra] and termination signals
derived from bacterial genes such as the trp gene of E. coli [Stryer, supra]. 

The term "eukaryotic polyadenylation sequence" (also referred to as a "poly A 
site" or "poly A sequence") as used herein denotes a DNA sequence which directs both 
the termination and polyadenylation of the nascent RNA transcript. Efficient 
polyadenylation of the recombinant transcript is desirable as transcripts lacking a poly 
A tail are unstable and are rapidly degraded. The poly A signal utilized in an 
expression vector may be "heterologous" or "endogenous." An endogenous poly A 
signal is one that is found naturally at the 3' end of the coding region of a given gene
in the genome. A heterologous poly A signal is one which is isolated from one gene 
and placed 3' of another gene. A commonly used heterologous poly A signal is the 
SV40 poly A signal. The SV40 poly A signal is contained on a 237 bp BamHI/BclI
restriction fragment and directs both termination and polyadenylation [J. Sambrook,
supra, at 16.6-16.7]; numerous vectors contain the SV40 poly A signal [e.g., pCEP4, 
pREP4, pEBVHis (Invitrogen)]. Another commonly used heterologous poly A signal 
is derived from the bovine growth hormone (BGH) gene; the BGH poly A signal is
available on a number of commercially available vectors [e.g., pcDNA3.1, pZeoSV2,
pSecTag (Invitrogen)]. The poly A signal from the Herpes simplex virus thymidine 
kinase (HSV tk) gene is also often used as a poly A signal on expression vectors. 
Vectors containing the HSV tk poly A signal include the pBK-CMV, pBK-RSV, and 
pOPBCAT vectors from Stratagene. 

As used herein, the terms "selectable marker" or "selectable marker gene" refer
to the use of a gene which encodes an enzymatic activity that confers the ability to
grow in medium lacking what would otherwise be an essential nutrient (e.g., the TRP1
gene in yeast cells). In addition, a selectable marker may confer resistance to an
antibiotic or drug upon the cell in which the selectable marker is expressed. A
selectable marker may be used to confer a particular phenotype upon a host cell.
When a host cell must express a selectable marker to grow in selective medium, the
marker is said to be a positive selectable marker (e.g., antibiotic resistance genes which
confer the ability to grow in the presence of the appropriate antibiotic). Selectable
markers can also be used to select against host cells containing a particular gene (e.g.,
the sacB gene, which, if expressed, kills bacterial host cells grown in medium
containing 5% sucrose, and the ΦX174 E gene). Selectable markers used in this
manner are referred to as negative selectable markers or counter-selectable markers. 

As used herein, the term "vector" is used in reference to nucleic acid molecules 
that transfer DNA segment(s) from one cell to another. The term "vehicle" is
sometimes used interchangeably with "vector." A "vector" is a type of "nucleic acid
construct." The term "nucleic acid construct" includes circular nucleic acid constructs 
such as plasmid constructs, phagemid constructs, cosmid vectors, etc. as well as linear 
nucleic acid constructs (e.g., λ phage constructs and PCR products). The nucleic acid
construct may comprise expression signals such as a promoter and/or an enhancer (in
such a case it is referred to as an expression vector). 

The term "expression vector" as used herein refers to a recombinant DNA 
molecule containing a desired coding sequence and appropriate nucleic acid sequences 
necessary for the expression of the operably linked coding sequence in a particular 
host organism. Nucleic acid sequences necessary for expression in prokaryotes usually 
include a promoter, an operator (optional), and a ribosome binding site, often along 
with other sequences. Eukaryotic cells are known to utilize promoters, enhancers, and 
termination and polyadenylation signals. 

The terms "in operable combination," "in operable order," and "operably 
linked" as used herein refer to the linkage of nucleic acid sequences in such a manner 
that a nucleic acid molecule capable of directing the transcription of a given gene 
and/or the synthesis of a desired protein molecule is produced. The term also refers to 
the linkage of amino acid sequences in such a manner so that a functional protein is 
produced. 

The terms "transformation" and "transfection" as used herein refer to the 
introduction of foreign DNA into prokaryotic or eukaryotic cells. Transformation of 
prokaryotic cells may be accomplished by a variety of means known to the art 
including the treatment of host cells with CaCl₂ to make competent cells,
electroporation, etc. Transfection of eukaryotic cells may be accomplished by a 
variety of means known to the art including calcium phosphate-DNA co-precipitation, 
DEAE-dextran-mediated transfection, polybrene-mediated transfection, electroporation, 
microinjection, liposome fusion, lipofection, protoplast fusion, retroviral infection, and 
biolistics, among other means. 

As used herein, the terms "restriction endonucleases" and "restriction enzymes"
refer to bacterial enzymes, each of which cuts double-stranded DNA at or near a
specific nucleotide sequence.

As used herein, the term "recombinant DNA molecule" refers to
a DNA molecule that comprises segments of DNA joined together by means of
molecular biological techniques.

The term "recombinant protein" or "recombinant polypeptide" as used herein
refers to a protein molecule that is expressed from a recombinant DNA molecule.

DNA molecules are said to have "5' ends" and "3' ends" because
mononucleotides are reacted to make oligonucleotides in a manner such that the 5'
phosphate of one mononucleotide pentose ring is attached to the 3' oxygen of its
neighbor in one direction via a phosphodiester linkage. Therefore, an end of an
oligonucleotide is referred to as the "5' end" if its 5' phosphate is not linked to the 3'
oxygen of a mononucleotide pentose ring and as the "3' end" if its 3' oxygen is not
linked to a 5' phosphate of a subsequent mononucleotide pentose ring. As used
herein, a nucleic acid sequence, even if internal to a larger oligonucleotide, also may
be said to have 5' and 3' ends. In either a linear or circular DNA molecule, discrete
elements are referred to as being "upstream" or 5' of the "downstream" or 3' elements.
This terminology reflects the fact that transcription proceeds in a 5' to 3' fashion along
the DNA strand. The promoter and enhancer elements that direct transcription of a
linked gene are generally located 5' or upstream of the coding region. However,
enhancer elements can exert their effect even when located 3' of the promoter element
and the coding region. Transcription termination and polyadenylation signals are
located 3' or downstream of the coding region.

The 3' end of a promoter element is said to be located upstream of the 5' end
of a sequence-specific recombinase target site when (moving in a 5' to 3' direction
along the nucleic acid molecule) the 3' terminus of a promoter element (the
transcription start site is taken as the 3' end of a promoter element) precedes the 5'
end of the sequence-specific recombinase target site. The 3' end of the promoter
element may be located adjacent (generally within about 0 to 500 bp) to the 5' end of
the sequence-specific recombinase target site. Such an arrangement is used when the
pHOST vector is not intended to permit the expression of a translational fusion with
the gene of interest donated by a pUNI vector. Alternatively, when the pHOST vector
is intended to permit the expression of a translational fusion, the 3' end of the
promoter element is located upstream of both the sequences encoding the amino-terminus
of a fusion protein and the 5' end of the sequence-specific recombinase target
site. In this case, the 5' end of the sequence-specific recombinase target site is located
within the coding region of the fusion protein (e.g., located downstream of both the
promoter element and the sequences encoding the affinity domain, such as Gst).
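
The positional relationships defined above can be expressed as simple coordinate comparisons. The following Python sketch is illustrative only. It assumes 1-based coordinates counted 5' to 3' along one strand, and every function name, label, and example coordinate is hypothetical rather than taken from the specification.

```python
# Illustrative sketch only: 1-based coordinates counted 5' to 3' along one strand.
# All names and example values are hypothetical, not part of the specification.

def promoter_is_upstream(promoter_3p_end: int, target_5p_end: int) -> bool:
    """True when the promoter's 3' end precedes the target site's 5' end."""
    return promoter_3p_end < target_5p_end


def intervening_bp(promoter_3p_end: int, target_5p_end: int) -> int:
    """Number of base pairs lying strictly between the two ends."""
    return target_5p_end - promoter_3p_end - 1


def is_adjacent(promoter_3p_end: int, target_5p_end: int, max_gap: int = 500) -> bool:
    """Adjacent in the sense used above: roughly 0 to 500 bp apart."""
    gap = intervening_bp(promoter_3p_end, target_5p_end)
    return 0 <= gap <= max_gap


def classify_phost_layout(promoter_3p_end, target_5p_end, tag_region=None):
    """Classify a pHOST layout.

    tag_region is an optional (start, end) pair for the sequences encoding the
    fusion partner (e.g., an affinity domain); None means no fusion partner.
    """
    if not promoter_is_upstream(promoter_3p_end, target_5p_end):
        return "invalid"
    if tag_region is None:
        return "no fusion partner"
    tag_start, tag_end = tag_region
    # For a translational fusion the order is: promoter, fusion partner, target site.
    if promoter_3p_end < tag_start <= tag_end < target_5p_end:
        return "translational fusion"
    return "invalid"
```

For instance, `classify_phost_layout(100, 900, tag_region=(120, 780))` describes a promoter followed by a fusion-partner coding region and then the recombinase target site, which corresponds to the translational fusion arrangement.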

As used herein, the phrase "an oligonucleotide having a nucleotide sequence 
encoding a gene" refers to a nucleic acid sequence comprising the coding region of a 
gene or, in other words, the nucleic acid sequence that encodes a gene product. The 
coding region may be present in either a cDNA, genomic DNA, or RNA form. When 
present in a DNA form, the oligonucleotide may be single-stranded (i.e., the sense 
strand) or double-stranded. Suitable control elements such as enhancers/promoters, 
splice junctions, polyadenylation signals, etc. may be placed in close proximity to the 
coding region of the gene if needed to permit proper initiation of transcription and/or 
correct processing of the primary RNA transcript. Alternatively, the coding region 
utilized in the vectors of the present invention may contain endogenous 
enhancers/promoters, splice junctions, intervening sequences, polyadenylation signals, 
etc. or a combination of both endogenous and exogenous control elements. 

As used herein, the term "regulatory element" refers to a genetic element that 
controls some aspect of the expression of nucleic acid sequences. For example, a 
promoter is a regulatory element that facilitates the initiation of transcription of an 
operably linked coding region. Other regulatory elements are splicing signals, 
polyadenylation signals, termination signals, etc. (defined infra).

Transcriptional control signals in eukaryotes comprise "promoter" and 
"enhancer" elements. Promoters and enhancers consist of short arrays of DNA 
sequences that interact specifically with cellular proteins involved in transcription 
[Maniatis, T. et al., Science 236:1237 (1987)]. Promoter and enhancer elements have
been isolated from a variety of eukaryotic sources including genes in yeast, insect, and 
mammalian cells and viruses (analogous control elements, i.e., promoters, are also 
found in prokaryotes). The selection of a particular promoter and enhancer depends on 
what cell type is to be used to express the protein of interest. Some eukaryotic 
promoters and enhancers have a broad host range while others are functional in a
limited subset of cell types [for review, see Voss, S.D. et al., Trends Biochem. Sci.,
11:287 (1986) and Maniatis, T. et al., supra (1987)]. For example, the SV40 early
gene enhancer is very active in a wide variety of cell types from many mammalian
species and has been widely used for the expression of proteins in mammalian cells
[Dijkema, R. et al., EMBO J. 4:761 (1985)]. Two other examples of
promoter/enhancer elements active in a broad range of mammalian cell types are those
from the human elongation factor 1α gene [Uetsuki, T. et al., J. Biol. Chem.,
264:5791 (1989), Kim, D.W. et al., Gene 91:217 (1990) and Mizushima, S. and
Nagata, S., Nuc. Acids. Res., 18:5322 (1990)] and the long terminal repeats of the
Rous sarcoma virus [Gorman, C.M. et al., Proc. Natl. Acad. Sci. USA 79:6777 (1982)]
and the human cytomegalovirus [Boshart, M. et al., Cell 41:521 (1985)].

As used herein, the term "promoter/enhancer" denotes a segment of DNA that 
contains sequences capable of providing both promoter and enhancer functions (i.e., 
the functions provided by a promoter element and an enhancer element, see above for 
a discussion of these functions). For example, the long terminal repeats of retroviruses 
contain both promoter and enhancer functions. The enhancer/promoter may be 
"endogenous" or "exogenous" or "heterologous." An "endogenous" enhancer/promoter 
is one which is naturally linked with a given gene in the genome. An "exogenous" or 
"heterologous" enhancer/promoter is one which is placed in juxtaposition to a gene by 
means of genetic manipulation (i.e., molecular biological techniques) such that 
transcription of that gene is directed by the linked enhancer/promoter. 

The presence of "splicing signals" on an expression vector often results in 
higher levels of expression of the recombinant transcript. Splicing signals mediate the 
removal of introns from the primary RNA transcript and consist of a splice donor and 
acceptor site [Sambrook, J. et al., Molecular Cloning: A Laboratory Manual, 2nd ed.,
Cold Spring Harbor Laboratory Press, New York (1989) pp. 16.7-16.8]. A commonly
used splice donor and acceptor site is the splice junction from the 16S RNA of SV40. 

Eukaryotic expression vectors may also contain "viral replicons" or "viral
origins of replication." Viral replicons are viral DNA sequences that allow for the
extrachromosomal replication of a vector in a host cell expressing the appropriate
replication factors. Vectors that contain either the SV40 or polyoma virus origin of
replication replicate to high copy number (up to 10⁴ copies/cell) in cells that express
the appropriate viral T antigen. Vectors that contain the replicons from bovine
papillomavirus or Epstein-Barr virus replicate extrachromosomally at low copy number
(~100 copies/cell).

As used herein, the terms "nucleic acid molecule encoding," "DNA sequence
encoding," and "DNA encoding" refer to the order or sequence of 
deoxyribonucleotides along a strand of deoxyribonucleic acid. The order of these 
deoxyribonucleotides determines the order of amino acids along the polypeptide 
(protein) chain. The DNA sequence thus codes for the amino acid sequence. 

As used herein, the term "gene" means the deoxyribonucleotide sequences 
comprising the coding region of a structural gene and including sequences located
adjacent to the coding region on both the 5' and 3' ends such that the gene 
corresponds to the length of the full-length mRNA. The sequences that are located 5' 
of the coding region and which are present on the mRNA are referred to as 5' non- 
translated sequences. The sequences that are located 3' or downstream of the coding 
region and which are present on the mRNA are referred to as 3' non-translated 
sequences. The term "gene" encompasses both cDNA and genomic forms of a gene. 
A genomic form or clone of a gene contains the coding region interrupted with non- 
coding sequences termed "introns" or "intervening regions" or "intervening sequences." 
Introns are segments of a gene that are transcribed into nuclear RNA (hnRNA); introns 
may contain regulatory elements such as enhancers. Introns are removed or "spliced 
out" from the nuclear or primary transcript. Introns therefore are absent in the 
messenger RNA (mRNA) transcript. The mRNA functions during translation to 
specify the sequence or order of amino acids in a nascent polypeptide. When a gene is 
altered such that its product is no longer biologically active in a wild-type fashion, the 
mutation is referred to as a "loss-of-function" mutation. When a gene is altered such 
that a portion or the entirety of the gene is deleted or replaced, the mutation is referred 
to as a "knockout" mutation.

In addition to containing introns, genomic forms of a gene may also include
sequences located on both the 5' and 3' end of the sequences that are present on the
RNA transcript. These sequences are referred to as "flanking" sequences or regions
(these flanking sequences are located 5' or 3' to the non-translated sequences present
on the mRNA transcript). The 5' flanking region may contain regulatory sequences
such as promoters and enhancers that control or influence the transcription of the gene.
The 3' flanking region may contain sequences that direct the termination of
transcription, post-transcriptional cleavage, and polyadenylation.

As used herein, the term "purified" or "to purify" refers to the removal of
contaminants from a sample. For example, recombinant Cre polypeptides are
expressed in bacterial host cells (e.g., as a Gst-Cre fusion protein) and the Cre
polypeptides are purified by the removal of at least a portion of the host cell proteins;
the percent of recombinant Cre polypeptides is thereby increased in the sample.

The term "native protein" is used herein to indicate that a protein does not
contain amino acid residues encoded by vector sequences; that is, the native protein
contains only those amino acids found in the protein as it occurs in nature. A native
protein may be produced by recombinant means or may be isolated from a naturally
occurring source.

As used herein, the term "portion" when in reference to a protein (as in "a
portion of a given protein") refers to fragments of that protein. The fragments may
range in size from four amino acid residues to the entire amino acid sequence minus
one amino acid.

As used herein, the term "fusion protein" refers to a chimeric protein containing
the protein of interest (e.g., the Cre protein) joined to an exogenous protein fragment
(e.g., the fusion partner which consists of non-Cre protein sequences). The fusion
partner may enhance solubility of the protein of interest as expressed in a host cell,
may provide an affinity tag to allow purification of the recombinant fusion protein
from the host cell or culture supernatant, or both, among other desired characteristics.
If desired, the fusion partner may be removed from the protein of interest by a variety
of enzymatic or chemical means known to the art.

DESCRIPTION OF THE INVENTION

The present invention provides compositions and methods that comprise a
system for the rapid subcloning of nucleic acid sequences in vivo and in vitro without
the need to use restriction enzymes. This system is referred to as the Univector Fusion
System or Univector Plasmid-fusion System (UPS). The UPS employs site-specific
recombination to catalyze plasmid fusion between a Univector (i.e., a plasmid
containing a gene of interest) and host vectors containing regulatory information. In
some embodiments of the present invention, plasmid fusion events are genetically
selected and result in placement of the gene of interest under the control of novel
regulatory elements. A second UPS-related method of the present invention allows for
the precise transfer of coding sequences alone from a Univector into a host vector.
UPS further provides means for the subcloning of entire nucleic acid libraries and the
directional cloning of linear nucleic acid molecules (e.g., PCR products).

The UPS offers many advantages over previously available technologies for the
manipulation of genes. For example, for a routine analysis of a new gene, it may be
desirable to express it in bacteria as a glutathione-S-transferase (Gst) or polyhistidine
fusion for purification and antibody production, to fuse it to the DNA-binding domain
of GAL4 or lexA for two hybrid analysis, to express it from the T7 promoter to allow
generation of a riboprobe or mRNA for in vitro transcription and translation, and to
express it in baculovirus, all in the course of a single study. One might also wish to
express the gene under the regulation of different promoters in a variety of organisms
or to mark it with different epitope tags to facilitate subsequent biochemical or
immunological analysis. All of these manipulations consume significant amounts of
time and energy using previously available technologies for two reasons. First, each of
the different vectors required for these studies was, for the most part, developed
independently and thus contains different sequences and restriction sites for insertion of
genes. Therefore, genes must be individually tailored to adapt to each of these
vectors. Secondly, the DNA sequence of any given gene varies and can contain
internal restriction sites that make it incompatible with particular vectors, thereby
complicating manipulation. The advent of the polymerase chain reaction (PCR) has
greatly facilitated the alteration of gene sequences and creation of compatible
restriction sites for subcloning purposes. However, the high error rate of thermostable
polymerases requires the sequence of each PCR-derived DNA fragment to be verified,
a time-consuming process.

The availability of whole genome sequences now provides the opportunity to
analyze large sets of genes for both genetic and biochemical properties. The need to
perform parallel processing of large gene sets exponentially amplifies the current
defects associated with conventional cloning methods. The methods and compositions
of the present invention provide a series of recombination-based approaches that
significantly reduce the time and effort involved in generating multiple transcriptional
and translational fusions for gene analysis and cDNA library construction. The present
invention provides a system whereby a gene can be placed under the control of any of
a variety of promoters or fused in frame to other proteins or peptides without the use
of restriction enzymes. As discussed above, the UPS uses site-specific recombination
to fuse two plasmids at a unique sequence adjacent to both a regulatory region and the
5' end of the gene of interest, thereby placing the gene under new regulation. This
system, together with the other methods and compositions of the present invention
discussed herein, provides a multifaceted approach for the rapid and efficient generation
and manipulation of recombinant DNA, thus making possible parallel processing of
whole genome sets of coding sequences.

The basis of the UPS is a vector termed the "Univector" or the "pUNI" vector
into which sequences encoding a gene of interest (cDNA or genomic) are inserted.
The pUNI vector has a sequence-specific recombinase target site, such as a loxP site,
preceding the insertion site for the gene of interest, a selectable marker gene (this
feature is optional) and a conditional origin of replication that is active only in host
cells expressing the requisite trans-acting replication factor (this feature is optional).
The pUNI vectors are designed to contain a gene of interest but lack a promoter for
the expression of the gene of interest. The gene of interest may be cloned directly into
the pUNI vector (i.e., the pUNI vector may be used as a cloning vector, particularly
for the cloning of cDNA libraries) or a previously cloned gene of interest may be
inserted (i.e., subcloned) into the pUNI vector.

Using a sequence-specific recombinase (e.g., Cre recombinase), a precise fusion
of the pUNI vector into a second vector containing another sequence-specific
recombinase target site is catalyzed. The second vector, referred to generically as a
"pHOST" vector, is a vector (e.g., expression vector) that contains the sequence-specific
recombinase target site downstream of a regulatory element (e.g., a promoter)
contained within the pHOST vector. Following the site-specific recombination event
which occurs between the single sequence-specific recombinase target sites located on
each vector (e.g., the pUNI vector and the pHOST vector), the two vectors are stably
fused in a manner that places the gene of interest under the control of the regulatory
element contained within the pHOST vector. When used for transfer into an
expression vector, this fusion event also occurs in a manner that retains the proper
translational reading frame of the gene of interest.

In some embodiments of the present invention, the fusion or recombination
event can be selected for by selecting for the ability of host cells, which do not express
a trans-acting replication factor required for replication of a conditional origin
contained on the pUNI vector, to acquire a selectable phenotype conferred by the
selectable marker gene (if present) on the pUNI vector. In these embodiments, the
pUNI vector cannot replicate in cells that do not express the trans-acting replication
factor and therefore, unless the pUNI vector has integrated into the second vector that
contains a non-conditional origin of replication, pUNI will be lost from the host cell.
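
The selection logic described above can be summarized as a small truth-table model. The Python sketch below is a simplification under stated assumptions: a plasmid is maintained only if it carries a non-conditional origin, or a conditional origin in a host that supplies the trans-acting factor, and a cell survives selection only if a maintained plasmid carries the relevant marker. The class, function, and marker names (including the "ApR" marker assigned to pHOST) are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plasmid:
    name: str
    conditional_origin: bool       # origin that needs a trans-acting host factor
    non_conditional_origin: bool   # origin that works in any suitable host
    markers: frozenset             # selectable markers carried by the plasmid


def replicates(plasmid: Plasmid, host_has_factor: bool) -> bool:
    """A plasmid is maintained if at least one of its origins can fire."""
    return plasmid.non_conditional_origin or (
        plasmid.conditional_origin and host_has_factor
    )


def fuse(a: Plasmid, b: Plasmid) -> Plasmid:
    """Model a plasmid fusion: the product carries both origins and all markers."""
    return Plasmid(
        name=f"{a.name}::{b.name}",
        conditional_origin=a.conditional_origin or b.conditional_origin,
        non_conditional_origin=a.non_conditional_origin or b.non_conditional_origin,
        markers=a.markers | b.markers,
    )


def survives(plasmids, host_has_factor: bool, selection: str) -> bool:
    """A cell survives if some maintained plasmid carries the selected marker."""
    return any(
        replicates(p, host_has_factor) and selection in p.markers for p in plasmids
    )


# Hypothetical example vectors (marker names are assumptions for illustration).
pUNI = Plasmid("pUNI", True, False, frozenset({"KnR"}))
pHOST = Plasmid("pHOST", False, True, frozenset({"ApR"}))
fusion = fuse(pUNI, pHOST)
```

In this model, a host lacking the replication factor that carries unfused pUNI and pHOST does not survive selection for the pUNI marker, whereas a host carrying the fused plasmid does, which mirrors the genetic selection for the fusion event.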

The Univector Fusion System allows any number of expression or fusion
constructs containing the gene of interest present on the pUNI vector to be made
rapidly (e.g., within a single day). Using conventional cloning or subcloning
techniques which employ restriction enzyme digestion(s), the production of a single
expression vector containing a gene of interest can take several days (i.e., for the
design and construction of each expression vector). In contrast, with the methods and
compositions of the present invention, once a battery of expression vectors modified to
contain the appropriate sequence-specific recombinase target site is made, a gene of
interest can be transferred to any number of expression vectors in an afternoon using
the Univector Fusion System. For example, Figure 1 provides a schematic illustrating
the straightforward recombination methods of the pUNI vectors and the Univector
Fusion System.

The present invention further provides methods and compositions for
directional subcloning of PCR fragments and other nucleic acid molecules into
Univectors or other vectors and methods and compositions for generation of epitope
tags and other fusions at the 3' end of open reading frames using homologous
recombination.

In general, UPS can be used to fuse any coding region of interest either with a
specific promoter to gain novel transcriptional regulation, with another coding
sequence to produce a fusion protein with novel properties (e.g., an epitope tag for
immunological detection or a DNA binding domain or transcriptional activation
domain for two hybrid analysis), or with any other desired regulatory element. As
discussed above, the UPS eliminates the need for restriction enzymes, DNA ligases,
and many in vitro manipulations required for subcloning. This relieves the constraints
on cloning vectors with respect to DNA sequence and size since the UPS reaction is
independent of vector size or sequence. Furthermore, the time-consuming processes
inherent in conventional cloning, such as the identification of a suitable vector,
designing a cloning strategy, restriction endonuclease digestion, agarose gel
electrophoresis, isolation of DNA fragments, and the ligation reaction, are shortened to a
20 minute UPS reaction. Due to the uniform nature of the UPS reaction and its
simplicity, dozens of constructs can be made simultaneously by simply using different
recipient vectors. In addition, in contrast to restriction enzymes and DNA ligases,
recombinases (e.g., Gst-Cre) can be made inexpensively in large quantities. These
features will save investigators significant amounts of time and expense.

Together, these methods constitute a comprehensive recombinational strategy
for the generation and manipulation of recombinant DNA that can be used for the
parallel processing of gene sets, an ability required for genomic analyses.

a) Conditional Origins of Replication and Suitable Host Cells

In some embodiments of the present invention, the pUNI vector comprises a
conditional origin of replication. Conditional origins of replication are origins that
require the presence or expression of a trans-acting factor in the host cell for
replication. A variety of conditional origins of replication functional in prokaryotic
hosts (e.g., E. coli) are known to the art. The present invention is illustrated with, but
not limited by, the use of the R6Kγ origin, oriR, from the plasmid R6K. The R6Kγ
origin requires a trans-acting factor, the Π protein supplied by the pir gene [Metcalf et
al. (1996) Plasmid 35:1]. E. coli strains containing the pir gene will support
replication of R6Kγ origins to medium copy number. A strain containing a mutant
allele of pir, pir-116, will allow an even higher copy number of constructs containing
the R6Kγ origin (i.e., 15 copies per cell for the wild type versus 250 copies per cell
for the mutant). This property may be useful when potentially toxic genes are
manipulated, although the chances of expression of a toxic gene are low because, in
preferred embodiments of the present invention, the Univector either contains no
promoter or contains a promoter driving the neo gene which is transcribed in the
opposite direction from the gene of interest.

E. coli strains that express the pir or pir-116 gene product include BW18815
(ATCC 47079; this strain contains the pir-116 gene), BW19094 (ATCC 47080; this
strain contains the pir gene), BW20978 (this strain contains the pir-116 gene),
BW20979 (this strain contains the pir gene), BW21037 (this strain contains the pir-116
gene) and BW21038 (this strain contains the pir gene) (Metcalf et al., supra).

Other conditional origins of replication suitable for use on the pUNI vectors of 
the present invention include, but are not limited to: 

1) the RK2 oriV from the plasmid RK2 (ATCC 37125). The RK2 oriV
requires a trans-acting protein encoded by the trfA gene [Ayres et al.
(1993) J. Mol. Biol. 230:174];

2) the bacteriophage P1 ori which requires the repA protein for replication
[Pal et al. (1986) J. Mol. Biol. 192:275];

3) the origin of replication of the plasmid pSC101 (ATCC 37032) which
requires a plasmid encoded protein, repA, for replication [Sugiura et al.
(1992) J. Bacteriol. 175:5993]. The pSC101 ori also requires IHF, an
E. coli protein. E. coli strains carrying the himA and himD (hip)
mutants (the him and hip genes encode subunits of IHF) cannot support
pSC101 replication [Stenzel et al. (1987) Cell 49:709];

4) the bacteriophage lambda ori which requires the lambda O and P
proteins [Lambda II, Hendrix et al., Eds., Cold Spring Harbor Press,
Cold Spring Harbor, NY (1983)];

5) pBR322 and other ColE1 derivatives will not replicate in polA mutants
of E. coli and therefore, these origins of replication can be used in a
conditional manner [Grindley and Kelley (1976) Mol. Gen. Genet.
143:311]; and

6) replication-thermosensitive plasmids such as pSU739 or pSU300 which
contain a thermosensitive replicon derived from plasmid pSC101, rep
pSC101ts, which comprises oriV [Mendiola and de la Cruz (1989) Mol.
Microbiol. 3:979 and Francia and Lobo (1996) J. Bact. 178:894].
pSU739 and pSU300 are stably maintained in E. coli strain DH5α
(Gibco BRL) at a growth temperature of 30°C (42°C is non-permissive
for replication of this replicon).

Other conditional origins of replication, including other temperature sensitive 
replicons, are known to the art and may be employed in the vectors and methods of 
the present invention. 

b) Sequence-Specific Recombinases And Target Recognition Sites 

The precise fusion between the pUNI vector and the expression vector is 
catalyzed by a site-specific recombinase. Site-specific recombinases are enzymes that 
recognize a specific DNA site or sequence (referred to herein generically as a
"sequence-specific recombinase target site") and catalyze the recombination of DNA in
relation to these sites. Site-specific recombinases are employed for the recombination
of DNA in both prokaryotes and eukaryotes. Examples of site-specific recombination
include, but are not limited to: 1) chromosomal rearrangements that occur in
Salmonella typhimurium during phase variation, inversion of the FLP sequence during
the replication of the yeast 2 μm circle, and in the rearrangement of immunoglobulin
and T cell receptor genes in vertebrates, 2) integration of bacteriophages into the
chromosome of prokaryotic host cells to form a lysogen, and 3) transposition of
mobile genetic elements (e.g., transposons) in both prokaryotes and eukaryotes. The
term "site-specific recombinase" refers to enzymes that recognize short DNA sequences
that become the crossover regions during the recombination event and includes
recombinases, transposases, and integrases.

The present invention is illustrated with, but not limited by, the use of vectors
containing lox sites (e.g., loxP sites) and the recombination of these vectors using the
Cre recombinase of bacteriophage P1. The Cre protein catalyzes recombination of
DNA between two loxP sites and is involved in the resolution of P1 dimers generated
by replication of circular lysogens [Sternberg et al. (1981) Cold Spring Harbor Symp.
Quant. Biol. 45:297]. Cre can function in vitro and in vivo in many organisms
including, but not limited to, bacteria, fungi, and mammals [Abremski et al. (1983)
Cell 32:1301; Sauer (1987) Mol. Cell. Biol. 7:2087; and Orban et al. (1992) Proc.
Natl. Acad. Sci. 89:6861]. A schematic for one embodiment of Cre-mediated plasmid
fusion is shown in Figure 14. In this figure, the Univector, pUNI, is the plasmid into
which the gene of interest is inserted and pHOST represents the recipient vector that
contains the appropriate transcriptional and/or translational regulatory sequences that
will eventually control the expression of the gene of interest. A recombinant
expression construct is made through Cre-loxP-mediated site-specific recombination
that fuses these two plasmids. This in vitro reaction generates a dimeric recombinant
plasmid in which the gene of interest from pUNI is placed downstream of the
promoter present on the host vector. In this example, the recombinant plasmid in
Figure 14 can be selected in a pir⁻ bacterial strain by selecting for Knʳ.

The loxP sites may be present on the same DNA molecule or they may be
present on different DNA molecules; the DNA molecules may be linear or circular or
a combination of both. The loxP site consists of a double-stranded 34 bp sequence
(SEQ ID NO: 12) which comprises two 13 bp inverted repeat sequences separated by
an 8 bp spacer region [Hoess et al. (1982) Proc. Natl. Acad. Sci. USA 79:3398 and
U.S. Patent No. 4,959,317, the disclosure of which is herein incorporated by
reference]. The internal spacer sequence of the loxP site is asymmetrical and thus, two
loxP sites can exhibit directionality relative to one another [Hoess et al. (1984) Proc.
Natl. Acad. Sci. USA 81:1026]. When two loxP sites on the same DNA molecule are
in a directly repeated orientation, Cre excises the DNA between these two sites leaving
a single loxP site on the DNA molecule [Abremski et al. (1983) Cell 32:1301]. If two
loxP sites are in opposite orientation on a single DNA molecule, Cre inverts the DNA
sequence between these two sites rather than removing the sequence. Two circular
DNA molecules each containing a single loxP site will recombine with one another to
form a mixture of monomer, dimer, trimer, etc. circles. The concentration of the DNA
circles in the reaction can be used to favor the formation of monomer (lower
concentration) or multimeric circles (higher concentration).
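
The 13 bp / 8 bp / 13 bp architecture of a lox site described above can be checked on a concrete sequence. The following Python sketch is a minimal illustration: it uses the loxP511 variant sequence quoted later in this section (with spaces removed) rather than SEQ ID NO: 12, which is not reproduced here, and the function names are hypothetical.

```python
# Illustrative sketch: split a 34 bp lox site into its two 13 bp arms and 8 bp
# spacer, then test the two properties described in the text.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def split_lox_site(site: str):
    """Return (left arm, spacer, right arm) for a 34 bp lox site."""
    site = site.upper()
    if len(site) != 34:
        raise ValueError("a lox site is expected to be 34 bp long")
    return site[:13], site[13:21], site[21:]


def describe_lox_site(site: str) -> dict:
    left, spacer, right = split_lox_site(site)
    return {
        "left_arm": left,
        "spacer": spacer,
        "right_arm": right,
        # Inverted repeats: the right arm reads as the reverse complement of the left.
        "arms_are_inverted_repeats": right == reverse_complement(left),
        # An asymmetric spacer differs from its own reverse complement, which is
        # what gives the site a direction.
        "spacer_is_asymmetric": spacer != reverse_complement(spacer),
    }


# loxP511 sequence as quoted in this document (SEQ ID NO: 16), spaces removed.
LOXP511 = "ATAACTTCGTATAGTATACATTATACGAAGTTAT"
info = describe_lox_site(LOXP511)
```

For this sequence the two arms are exact inverted repeats and the spacer is not its own reverse complement, which is consistent with the directionality of lox sites discussed above.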

Circular DNA molecules having a single loxP site will recombine with a linear
molecule having a single loxP site to produce a larger linear molecule. Cre interacts
with a linear molecule containing two directly repeating loxP sites to produce a circle
containing the sequences between the loxP sites and a single loxP site and a linear
molecule containing a single loxP site at the site of the deletion.
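
The outcomes described in the preceding paragraphs depend only on whether the two sites lie on the same molecule, their relative orientation, and the topology of the molecules. The Python sketch below encodes those rules as a lookup. It is a qualitative model only; the class, the function, and the outcome labels are hypothetical, and the case of two linear molecules is flagged because it is not described in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoxSite:
    molecule: str     # name of the DNA molecule carrying the site
    circular: bool    # topology of that molecule
    orientation: int  # +1 or -1, direction set by the asymmetric spacer


def predict_outcome(a: LoxSite, b: LoxSite) -> str:
    """Qualitative product of Cre acting on two lox sites, per the rules above."""
    if a.molecule == b.molecule:
        # Two sites on one molecule: orientation decides the outcome.
        if a.orientation == b.orientation:
            return "excision"   # directly repeated sites
        return "inversion"      # sites in opposite orientation
    if a.circular and b.circular:
        return "circle fusion"  # dimer; higher multimers are also possible
    if a.circular or b.circular:
        return "larger linear molecule"
    return "not described in the text"


# Example corresponding to the plasmid fusion used in the Univector system.
pUNI_site = LoxSite("pUNI", circular=True, orientation=+1)
pHOST_site = LoxSite("pHOST", circular=True, orientation=+1)
outcome = predict_outcome(pUNI_site, pHOST_site)
```

A usage note: two single-site circles, such as the pUNI and pHOST vectors, fall in the "circle fusion" case, which is the reaction exploited for plasmid fusion.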

The Cre protein has been purified to homogeneity [Abremski et al. (1984) J.
Biol. Chem. 259:1509] and the cre gene has been cloned and expressed in a variety of
host cells [Abremski et al. (1983), supra]. Purified Cre protein is available from a
number of suppliers (e.g., Novagen and New England Nuclear/DuPont).

The Cre protein also recognizes a number of variant or mutant lox sites (variant
relative to the loxP sequence), including the loxB, loxL and loxR sites which are found
in the E. coli chromosome [Hoess et al. (1982), supra]. Other variant lox sites include
loxP511 [5'-ATAACTTCGTATAGTATACATTATACGAAGTTAT-3' (SEQ ID
NO: 16); spacer region underlined; Hoess et al. (1986), supra], and loxC2
[5'-ACAACTTCGTATAATGTATGCTATACGAAGTTAT-3' (SEQ ID NO: 17); spacer region
underlined; U.S. Patent No. 4,959,317]. Cre catalyzes the cleavage of the lox site
within the spacer region and creates a six base-pair staggered cut [Hoess and Abremski
(1985) J. Mol. Biol. 181:351]. The two 13 bp inverted repeat domains of the lox site
represent binding sites for the Cre protein. If two lox sites differ in their spacer
regions in such a manner that the overhanging ends of the cleaved DNA cannot
reanneal with one another, Cre cannot efficiently catalyze a recombination event using
the two different lox sites. For example, it has been reported that Cre cannot
recombine (at least not efficiently) a loxP site and a loxP511 site; these two lox sites
differ in the spacer region. Two lox sites which differ due to variations in the binding
sites (i.e., the 13 bp inverted repeats) may be recombined by Cre provided that Cre can
bind to each of the variant binding sites. The reaction between two different lox sites
(varying in the binding sites) may be less efficient than that between two
lox sites having the same sequence (the efficiency will depend on the degree and the
location of the variations in the binding sites). For example, the loxC2 site can be
efficiently recombined with the loxP site, as these two lox sites differ by a single
nucleotide in the left binding site.
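The spacer rule described above can be sketched as a simple comparison. The following is a minimal illustration, not a predictor of recombination efficiency: it treats two sites as compatible only when their 8 bp spacers are identical in one of the two orientations (a simplification of the requirement that the staggered ends reanneal), and it counts arm differences as written. The loxP sequence is the widely published wild-type sequence, assumed to match SEQ ID NO: 12; the function names are hypothetical.

```python
LOX_SITES = {
    "loxP":    "ATAACTTCGTATAATGTATGCTATACGAAGTTAT",  # assumed wild-type sequence
    "loxP511": "ATAACTTCGTATAGTATACATTATACGAAGTTAT",  # SEQ ID NO: 16 as given above
    "loxC2":   "ACAACTTCGTATAATGTATGCTATACGAAGTTAT",  # SEQ ID NO: 17 as given above
}

def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def split_site(site):
    """Split a 34 bp lox site into (left arm, spacer, right arm)."""
    if len(site) != 34:
        raise ValueError("expected a 34 bp lox site")
    return site[:13], site[13:21], site[21:]

def spacers_match(name_a, name_b):
    """True if the two spacers are identical in either orientation (simplified rule)."""
    spacer_a = split_site(LOX_SITES[name_a])[1]
    spacer_b = split_site(LOX_SITES[name_b])[1]
    return spacer_a == spacer_b or spacer_a == revcomp(spacer_b)

def arm_mismatches(name_a, name_b):
    """Count nucleotide differences in the two 13 bp binding arms, as written."""
    left_a, _, right_a = split_site(LOX_SITES[name_a])
    left_b, _, right_b = split_site(LOX_SITES[name_b])
    return sum(x != y for x, y in zip(left_a + right_a, left_b + right_b))
```

Under this simplified rule, loxP and loxC2 share a spacer and differ at one arm position, whereas loxP and loxP511 do not share a spacer, which is consistent with the behavior reported in the text.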

A variety of other site-specific recombinases may be employed in the methods
of the present invention in place of the Cre recombinase. Alternative site-specific
recombinases include, but are not limited to:

1) the FLP recombinase of the 2μ plasmid of Saccharomyces cerevisiae
[Cox (1983) Proc. Natl. Acad. Sci. USA 80:4223] which recognizes the
frt site. Like the loxP site, the frt site comprises two 13 bp inverted
repeats separated by an 8 bp spacer
[5'-GAAGTTCCTATTCTCTAGAAAGTATAGGAACTTC-3' (SEQ ID
NO: 18); spacer underlined]. The FLP gene has been cloned and
expressed in E. coli (Cox, supra) and in mammalian cells (PCT
International Patent Application PCT/US92/01899, Publication No.:
WO 92/15694, the disclosure of which is herein incorporated by
reference) and has been purified [Meyer-Leon et al. (1987) Nucleic
Acids Res. 15:6469; Babineau et al. (1985) J. Biol. Chem. 260:12313;
and Gronostajski and Sadowski (1985) J. Biol. Chem. 260:12328];

2) the Int recombinase of bacteriophage lambda (with or without Xis) 
which recognizes att sites (Weisberg et al. In: Lambda II, supra, pp. 
211-250); 

3) the xerC and xerD recombinases of E. coli which together form a
recombinase that recognizes the 28 bp dif site [Leslie and Sherratt
(1995) EMBO J. 14:1561];

4) the Int protein from the conjugative transposon Tn916 [Lu and 
Churchward (1994) EMBO J. 13:1541]; 

5) TpnI and the β-lactamase transposons [Levesque (1990) J. Bacteriol.
172:3745];

6) the Tn3 resolvase [Flanagan et al. (1989) J. Mol. Biol. 206:295 and 
Stark et al. (1989) Cell 58:779]; 

7) the SpoIVC recombinase of Bacillus subtilis [Sato et al. (1990) J. 
Bacteriol. 172:1092]; 

8) the Hin recombinase [Glasgow et al. (1989) J. Biol. Chem. 264:10072];

9) the Cin recombinase [Hafter et al. (1988) EMBO J. 7:3991]; and 

10) the immunoglobulin recombinases [Malynn et al. Cell (1988) 54:453]. 

c) Modification of Expression Vectors 

As discussed above, pUNI vectors are used to transfer a gene of interest into a
suitably modified vector via site-specific recombination. The modified vectors or host
vectors used in the Univector Fusion System are referred to as pHOST vectors.
pHOST vectors are generally expression vectors (e.g., plasmids) which have been
modified by the insertion of a sequence-specific recombinase target site (e.g., a lox
site). However, the pHOST can comprise any regulatory sequence desired for
manipulation of nucleic acids. The presence of the sequence-specific recombinase
target site on the pHOST plasmid permits the rapid subcloning or insertion of the gene
of interest contained within a pUNI vector to generate an expression vector capable of
expressing the gene of interest. In some embodiments of the present invention, the
pHOST vector may encode a protein domain such as an affinity domain including, but
not limited to, glutathione-S-transferase (Gst), maltose binding protein (MBP), a
portion of staphylococcal protein A (SPA), a polyhistidine tract, etc. A variety of
commercially available expression vectors encoding such affinity domains are known
to the art. The affinity domain may be located at either the amino- or carboxy-
terminus of the fusion protein. When the pHOST plasmid contains a vector-encoded
affinity domain, a fusion protein comprising the vector-encoded affinity domain and
the protein of interest is generated when the pUNI and pHOST vectors are
recombined.

To generate expression vectors intended to generate transcriptional fusions (i.e.,
pHOST does not contain a vector-encoded protein domain), a sequence-specific
recombinase target site is placed after (i.e., downstream of) the start of transcription in
the host vector. This is easily accomplished using synthetic oligonucleotides
comprising the desired sequence-specific recombinase target site. In designing the
oligonucleotide comprising the sequence-specific recombinase target site, care is taken
to avoid introducing an ATG or start codon that might initiate translation
inappropriately.

To generate expression vectors intended to generate a fusion protein between a
vector-encoded protein domain located at the amino-terminus of the fusion protein and
the protein of interest (encoded by the gene of interest contained within the pUNI
vector) (i.e., a translational fusion), care is taken to place the sequence-specific
recombinase target site in the correct reading frame such that: 1) an open reading
frame is maintained through the sequence-specific recombinase target site on pHOST,
and 2) the open reading frame in the sequence-specific recombinase target site on
pHOST is in frame with the open reading frame found on the sequence-specific
recombinase target site contained within the pUNI vector. In addition, the
oligonucleotide comprising the sequence-specific recombinase target site on pHOST is
designed to avoid the introduction of in-frame stop codons. The gene of interest
contained within the pUNI vector is cloned in a particular reading frame so as to
facilitate the creation of the desired fusion protein.
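The two design checks described above (no in-frame stop codon through the target site, and awareness of any ATG that the site introduces) can be sketched as follows. This is an illustrative check only, applied to the widely published wild-type loxP sequence (assumed to match SEQ ID NO: 12) read in one orientation; the function names are hypothetical and a real linker would also include flanking nucleotides.

```python
# Assumption: widely published wild-type loxP, read 5' to 3' on one strand.
LOXP = "ATAACTTCGTATAATGTATGCTATACGAAGTTAT"
STOP_CODONS = {"TAA", "TAG", "TGA"}

def open_frames(site):
    """Return the reading frames (0, 1, 2) of site that contain no stop codon."""
    frames = []
    for frame in range(3):
        codons = [site[k:k + 3] for k in range(frame, len(site) - 2, 3)]
        if not any(codon in STOP_CODONS for codon in codons):
            frames.append(frame)
    return frames

def atg_positions(site):
    """Return the 0-based positions of every ATG in site."""
    return [k for k in range(len(site) - 2) if site[k:k + 3] == "ATG"]
```

For the sequence and orientation assumed here, only one frame is free of stop codons, so an upstream affinity domain must be joined in that frame; the reported ATG positions indicate where translation might start inappropriately in a transcriptional fusion.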

The modification of several expression vectors is provided in the examples
below to illustrate the creation of suitable pHOST vectors. At present, approximately
40 pHOST vectors have been generated, including GST expression vectors, yeast
GAL1 expression vectors, mammalian CMV expression vectors, and baculovirus
expression vectors. In each case, expression was at or near the levels achieved by
conventional cloning. A general strategy for generating any pHOST of interest
involves the generation of a linker containing the desired sequence-specific
recombinase target site (e.g., a lox site such as loxP or loxH) by annealing two
complementary oligonucleotides. The annealed oligonucleotides form a linker having
sticky ends that are compatible with ends generated by restriction enzymes whose sites
are conveniently located in the parental expression vector (e.g., within a polylinker of
the parental expression vector). Thus, any vector can be easily adapted for use with
the UPS method.

d) In Vitro Recombination 

The fusion of a pUNI vector and a pHOST vector is accomplished in vitro
using a purified preparation of a site-specific recombinase (e.g., Cre recombinase).
The pUNI vector and the pHOST vector are placed in a reaction vessel (e.g., a
microcentrifuge tube) in a buffer compatible with the site-specific recombinase to be
used. For example, when a Cre recombinase (native or a fusion protein form) is
employed, the reaction buffer may comprise 50 mM Tris-HCl (pH 7.5), 10 mM
MgCl2, 30 mM NaCl and 1 mg/ml BSA. When a FLP recombinase is employed, the
reaction buffer may comprise 50 mM Tris-HCl (pH 7.4), 10 mM MgCl2, 100 μg/ml
BSA [Gronostajski and Sadowski, supra]. The concentration of the pUNI vector and
the pHOST vector may vary between 100 ng to 1.0 μg of each vector per 20 μl
reaction volume with about 0.1 μg of each nucleic acid construct (0.2 μg total) per 20
μl reaction being preferred. The concentration of the site-specific recombinase may be
titered under a standard set of reaction conditions to find the optimal concentration of
enzyme to be used as described in Example 4.
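The arithmetic for assembling such a reaction can be sketched as below. The vector mass per reaction (0.1 to 1.0 μg of each vector in 20 μl, 0.1 μg preferred) comes from the text; the concentrated buffer stock (10x), the recombinase volume and the function name are illustrative assumptions, since the optimal enzyme amount is determined by titration.

```python
def plan_fusion_reaction(puni_stock_ug_per_ul, phost_stock_ug_per_ul,
                         mass_each_ug=0.1, total_ul=20.0,
                         buffer_fold=10.0, enzyme_ul=1.0):
    """Return the volume (in microliters) of each component of one fusion reaction.

    buffer_fold and enzyme_ul are assumptions for illustration, not values from the text.
    """
    if not 0.1 <= mass_each_ug <= 1.0:
        raise ValueError("text describes 0.1 to 1.0 ug of each vector per 20 ul")
    volumes = {
        "pUNI": mass_each_ug / puni_stock_ug_per_ul,
        "pHOST": mass_each_ug / phost_stock_ug_per_ul,
        "buffer": total_ul / buffer_fold,
        "recombinase": enzyme_ul,
    }
    water = total_ul - sum(volumes.values())
    if water < 0:
        raise ValueError("stocks too dilute for the requested reaction volume")
    volumes["water"] = water
    return volumes
```

For instance, with a pUNI stock at 0.05 μg/μl and a pHOST stock at 0.1 μg/μl, the preferred 0.1 μg of each vector corresponds to 2 μl and 1 μl respectively, with the remainder made up by buffer, enzyme and water.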

Following the in vitro fusion reaction, a portion of the reaction mixture is used
to transform a suitable host cell to permit the recovery and propagation of the fused
vectors. In some embodiments of the present invention, the host cell employed will
not express the trans-acting factor required for replication of the conditional origin of
replication contained within the pUNI vector (or alternatively the host cell will be
grown at a temperature which is non-permissive for replication of a temperature
sensitive replicon contained within the pUNI vector). The host cells will be grown
under conditions that select for the presence of the selectable marker contained within
the pUNI vector (e.g., growth in the presence of kanamycin when the pUNI vector
contains a kanamycin resistance gene). Plasmid or non-chromosomal DNA is isolated
from host cells which display the desired phenotype and subjected to restriction
enzyme digestion to confirm that the desired fusion event has occurred.

e) Recombination in Prokaryotic Host Cells 

The fusion of a pUNI vector and a pHOST vector may be accomplished in vivo
using a host cell that expresses the appropriate site-specific recombinase (e.g., Cre
recombinase). The host cell may express the recombinase as part of its genome or
may be supplied with means for expressing the recombinase (e.g., a recombinase
expression vector). In embodiments of the present invention that employ a pUNI
vector with a conditional origin of replication, the host cell employed lacks the ability
to express the trans-acting factor required for replication of the conditional origin of
replication (or alternatively the host cell will be grown at a temperature which is
non-permissive for replication of a temperature sensitive replicon contained within the
pUNI vector).

The pUNI vector and the pHOST vector are cotransformed into the host cell
using a variety of methods known to the art (e.g., transformation of cells made
competent by treatment with CaCl2, electroporation, etc.). The cotransformed host
cells are grown under conditions that select for the presence of the selectable marker
contained within the pUNI vector (e.g., growth in the presence of kanamycin when the
pUNI vector contains the kanamycin resistance gene). Plasmid or non-chromosomal
DNA is isolated from host cells which display the desired phenotype and subjected to
restriction enzyme digestion to confirm that the desired fusion event has occurred.

f) Precise ORF Transfer (POT) 

UPS results in the fusion of two plasmids and is suitable for the vast majority
of expression needs. In rare cases where the size of the recombinant molecule is
limiting (e.g., in the generation of retrovirus or adeno-associated viral [AAV]
expression constructs), it might be desirable to transfer only the gene of interest and
not the approximately 2 kb remainder of the Univector. To accomplish this, a second
recombination event is utilized. In some embodiments of the present invention, this
second recombination is catalyzed by the R recombinase [Araki et al. (1992) J. Mol.
Biol. 225:25] that allows a resolution of the UPS generated heterodimer as described in
Example 9, although a variety of second recombinases will find use with the present
invention (e.g., the Res system). POT functions both in vivo and in vitro. It is
recommended that POT only be used in those cases where size is a limitation.

In some embodiments of the present invention, a standard UPS method is
utilized to generate a dimer containing the entire pUNI and pHOST vectors, followed
by a reaction with the second recombinase that excises the unwanted portions of the
Univector. Alternatively, host cells or reaction conditions can be applied that allow
both recombination reactions to occur in a single step (See Example 9). Cells
containing the desired recombinant product can be selected for by using selectable
markers, and/or conditional origins of replication.

g) Generation of 3' Gene Fusions on the Univector

While UPS greatly facilitates the generation of fusion proteins at the N-terminus
of the protein of interest, it is often necessary to modify proteins on the C-terminus
(e.g., to add an epitope tag). To facilitate this class of modification, the
present invention takes advantage of E. coli's endogenous homologous recombination
system. It has been shown [Winans et al. (1985) J. Bacteriol. 161:1219] that E. coli
strains mutant for recBC, but containing a suppressor sbc, could take up linear DNA
and recombine it onto the E. coli chromosome or resident plasmids, much as has been
shown for S. cerevisiae. recD mutants have been shown to behave in a similar manner
[Russell et al. (1989) J. Bacteriol. 171:2609]. However, such systems have not been
used for recombinant cloning in E. coli. In fact, these systems are incompatible with
many cloning protocols, as the endogenous restriction modification systems of the cell
would digest the samples to be cloned.

The present invention provides means to overcome these problems and to
provide for effective cloning and recombination (e.g., with the UPS). To facilitate
recombination onto Univector plasmids, the present invention provides BUN10, a
recBC sbcB hsdR strain expressing pir-116. The hsdR mutation prevents restriction of
nucleic acid (e.g., PCR amplified DNA) by the endogenous restriction modification
system of E. coli. In one embodiment of the present invention, this system was tested
using a 3xMYC epitope tag and the SKP1 gene in pUNI-10 as the recipient. pML74,
which is pUNI-Amp containing a triple (3x) MYC epitope tag followed by a stop
codon, was used as template DNA for PCR amplification with two primers, A and B.
Primer A (SEQ ID NO:30) is 71 nt long, the first 50 nt of which correspond to the
last 50 nt of the SKP1 coding region and the last 21 nt, the 3' end of the primer,
correspond to the first 21 nt of the DNA encoding the 3xMYC tag. The reading
frames of SKP1 and the 3xMYC tag are in register. Primer B (SEQ ID NO:31) is 22
nt long and recognizes a site on pML74 common to pUNI vectors that begins 367 bp
from the polylinker region. Amplification using primers A and B and pML74 as a
template generated a fragment of DNA with 50 bp homology to the Univector. This
amplification product was co-transformed with BamHI-SacI-cleaved pUNI-SKP1 into
BUN10 cells and Kn^r transformants were selected and analyzed by restriction mapping.
Homologous recombination events are selected because they allow the recircularization
of the linearized vector. A schematic representation of this method is provided in
Figure 25. Ten percent of Kn^r transformants resulted in homologous recombination at
the C-terminus of the SKP1 gene to generate a SKP1-3xMYC tag. This experiment
demonstrates that homologous recombination in E. coli can be used to alter the
sequence of genes in 3' regions adjacent to restriction sites.
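The layout of primer A (50 nt of homology to the end of the coding region followed by 21 nt that prime on the tag, with the two reading frames in register) can be sketched as a small helper. This is an illustration under stated assumptions and does not reproduce SEQ ID NO:30: the function name is hypothetical, the input sequences are placeholders supplied by the user, and removal of a terminal stop codon before joining is an assumption made so that translation continues into the tag.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def design_fusion_primer(orf, tag, homology_nt=50, priming_nt=21):
    """Build a primer: last homology_nt of the ORF + first priming_nt of the tag.

    Assumes orf starts at its ATG and that tag starts at a codon boundary, so
    joining the ORF (minus any stop codon) directly to the tag keeps the frame.
    """
    orf, tag = orf.upper(), tag.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    if orf[-3:] in STOP_CODONS:
        orf = orf[:-3]  # assumption: drop the stop so the tag is translated
    if len(orf) < homology_nt or len(tag) < priming_nt:
        raise ValueError("ORF or tag shorter than the requested segment")
    return orf[-homology_nt:] + tag[:priming_nt]
```

With the default segment lengths the returned primer is 71 nt long, matching the layout described for primer A; any homology or priming length used in practice would be chosen for the particular recipient gene and template.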

Furthermore, it is clear that this method is generally applicable to broader
cloning strategies. Although the example above describes the use of an amplification
product for recombination into the pUNI vector, any nucleic acid sample with
sufficient sequence complementarity can be used. Thus, the sample to be inserted
could be artificially synthesized or prepared by any other means. Additionally, the
recombination event can be designed to occur at any desired location on any desired
recipient vector (i.e., is not limited to the production of 3' gene fusions).

h) Method for Directional Subcloning into pUNI Vectors 

When cloning blunt ended nucleic acid molecules, such as those generated by
thermostable polymerases, it is desirable to have a way of identifying desired
recombinant molecules (e.g., vectors containing the insert in a desired orientation).
This is of great relevance to the UPS because the initial cloning of genes into pUNI
will often utilize PCR amplified material. To facilitate this process, the present
invention provides a method for directional subcloning into vectors (e.g., pUNI
derivatives) that relies upon the generation of a reconstituted regulatory element from
two partial sites located on the fragment to be cloned and the recipient vector,
respectively. For example, a linear nucleic acid molecule to be inserted into a vector
can be designed with a portion of a promoter at its 3' or 5' ends. The recipient vector
is then designed with the remainder of the promoter, arranged such that, when the
cloned fragment is inserted in the desired direction, an intact promoter is reconstituted
and provides a means of detecting the successful directional cloning event.

It is clear that a variety of reconstituted regulatory elements can be employed to
achieve detectable directional cloning. For example, reconstituted regulatory elements
that find use with the present invention include, but are not limited to, promoters,
repressors, operators, enhancers, enzyme recognition sites, selectable markers, and
conditional origins of replication, among others. It is also contemplated that the
reconstituted regulatory element may comprise a negative selection capability, such
that fragments cloned in an undesired orientation reconstitute the regulatory element
and are selected against. One skilled in the art will recognize the wide range of
regulatory elements and applications that can be applied to this system.

To demonstrate the effectiveness of the above approach, the lac operator was
employed to direct directional subcloning events. Luria and colleagues observed in the
early 1960s that phage carrying the binding site for the lac repressor, lacO, could
induce the expression of the endogenous lacZ gene by titrating out a limited number of
repressor proteins [Miller and Reznikoff, Eds. (1978) The Operon, Cold Spring Harbor
Laboratory, Cold Spring Harbor, NY] and this was shown to be true when lacO was
present on high copy number plasmids [Marians et al. (1976) Nature 263:744; and
Heyneker et al. (1976) Nature 263:748], as illustrated in Figure 22A. Figure 22A
shows a schematic representation of normal conditions in the absence of inducer (left
diagram) where LacR is bound to the lac operator sites in front of lacZ and represses
transcription. In the presence of high copy number plasmid containing the lacO
sequence (right diagram), LacR repressors are titrated out by binding to plasmid borne
lacO sites and the endogenous lacZ gene is expressed.

The methods of the present invention take advantage of this observation: the 3'
half of a lacO site was placed on a pUNI vector (i.e., pUNI-30). The lacO derivative
used was a symmetrical 20 bp site that has an Eco47III site at
the center. To utilize this method for cloning PCR derived material, primers were
made corresponding to the SKP1 gene. A 10 bp sequence corresponding to the 5' half
of the symmetrical lacO sequence (shown in Figure 22B) was added to the 5' end of
the 3' primer. Figure 22B shows this strategy, whereby primer A (5') and B (3') are
used to amplify the gene of interest. The 5' end of primer B contains a half lacO site
which subsequently becomes the 3' end of the PCR fragment indicated in the Figure.
After ligating the PCR fragment into linearized pUNI-30 containing the other half of
lacO, an intact lacO site is reconstituted and, in Lac+ cells, results in induction of
endogenous β-galactosidase and production of blue colonies in the presence of X-Gal.
The PCR fragment was ligated into Eco47III-cleaved pUNI-30 and transformed into
BUN10, a Lac+ E. coli strain, and Kn^r colonies were selected on plates containing
X-Gal. Plasmids containing SKP1 in the proper orientation were identified by their dark
blue color (shown by arrows in Figure 22C). Reclosure of the vector without insert as
well as the presence of the PCR fragment in the incorrect orientation result in the
production of white or pale blue colonies. Ten out of 10 dark blue colonies contained
SKP1 in the correct orientation. In particularly preferred embodiments, phosphorylated
PCR primers are used. In other preferred embodiments, Taq polymerase is used, and
the material is preferably treated briefly with T4 polymerase and dNTPs to remove the
3' overhangs generated.
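The logic of the half-site strategy can be sketched at the level of top-strand sequences. The operator sequence below is the commonly cited symmetric lac operator, which has an Eco47III site (AGCGCT) at its center; whether it is the exact 20 bp site present in pUNI-30 is an assumption, as are the placeholder vector and insert sequences and the function names. The sketch only shows that an intact operator appears when the insert is ligated in one orientation and not in the other, or when the vector recloses without insert.

```python
# Assumption: commonly cited symmetric lac operator (palindromic, 20 bp).
LACO_SYM = "AATTGTGAGCGCTCACAATT"
LEFT_HALF, RIGHT_HALF = LACO_SYM[:10], LACO_SYM[10:]

def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def ligate(vector_left, insert_top, vector_right, flipped=False):
    """Top strand of the blunt-end ligation product; flipped reverses the insert."""
    insert = revcomp(insert_top) if flipped else insert_top
    return vector_left + insert + vector_right

def has_intact_laco(seq):
    """True if a complete symmetric operator is present in seq."""
    return LACO_SYM in seq

# Placeholder sequences: the cleaved vector carries one operator half at the cut,
# and the PCR product carries the complementary half at one end.
VECTOR_LEFT = "CCGGTTAACC"
VECTOR_RIGHT = RIGHT_HALF + "GGCCTTAAGG"
INSERT_TOP = "ATGACCGGTAAA" + LEFT_HALF
```

In this model only the correctly oriented insert yields a plasmid with a complete operator, corresponding to the dark blue colonies; the other two products lack it, corresponding to the white or pale blue colonies described above.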

i) Library Transfer Using UPS 

In addition to permitting the rapid transfer of a gene of interest from a
particular pUNI vector containing a gene of interest into a pHOST vector, the
Univector Fusion System permits the rapid exchange of an entire cDNA library to a
variety of expression vectors. This capability to essentially transform one library into
many libraries is one of the most significant advances made possible by the UPS
methods provided by the present invention. The high efficiency of the in vitro UPS
reaction (i.e., a minimum of 16.8%) coupled with the extremely high efficiency of
modern transformation methods makes possible the conversion of whole cDNA
libraries constructed in the Univector into expression libraries without loss of
representation. Thus, it is contemplated that single cDNA libraries will be converted
into any of a number of different expression libraries such as those used in the two
hybrid systems [Durfee et al. (1993) Genes & Dev. 7:55; and Aronheim et al. (1997)
Mol. Cell. Biol. 17:3094], for complementation cloning in yeast [Elledge et al. (1991)
Proc. Natl. Acad. Sci. 88:1731], mammalian expression systems [Okayama and Berg
(1982) Mol. Cell. Biol. 2:161], etc. Thus, the present invention provides methods such
that libraries made for one purpose will no longer need to be remade from scratch
when needed in a different context; clones isolated from these libraries are easily
converted back into simple Univector plasmids compatible with other pHOST vectors
for future analysis.

In these methods, the cDNA library is generated using a pUNI vector as the
cloning vector (a pUNI library). The entire library may then be transferred (using
either an in vitro or an in vivo recombination reaction) into any expression vector
modified to contain a sequence-specific recombinase target site (e.g., a lox site) (i.e.,
into a pHOST vector). This solves an existing problem in the art, in that there is no
way, using existing vector systems, to exchange the inserts in a library made in one
expression vector en masse (i.e., as an entire library) to a different expression vector.
Example 10 provides an illustration of such capabilities using methods of the present
invention.

In addition, the sequences contained within a pUNI library can be used to
recombine with linear λ constructs (which can then be used to isolate specific genes by
complementation of appropriate host cells such as E. coli or S. cerevisiae mutant cells).
For example, UPS is compatible with the λYES series of lambda cloning vectors that
use cre-lox recombination to convert phage clones into plasmids. These vectors are
capable of making extremely large cDNA libraries (i.e., greater than 10^8 recombinants
per 100 ng of cDNA) and, unlike plasmid libraries, can be propagated with minimal
loss of representation. Further, as described in Example 7, the in vivo gene trap
method, a variation of the Univector Fusion System, can be used to transfer linear
DNA fragments that lack a selectable marker, such as a PCR product, into a variety of
expression vectors.

An extremely important application of the UPS method is in the manipulation
of whole genome sets of coding regions. For organisms whose genomes have been
sequenced, a complete set of identified ORFs, or "Unigene" set, can be constructed in
the Univector and be systematically converted by UPS into any kind of expression
library. Also, the simplicity and uniformity of the UPS reaction makes it readily
amenable to automation for systematic conversion of arrayed clones. This greatly
expedites the functional characterization of whole genomes and helps further the
progression of genome projects into proteome projects.

EXPERIMENTAL 

The following examples serve to illustrate certain preferred embodiments and
aspects of the present invention and are not to be construed as limiting the scope
thereof.

In the experimental disclosure which follows, the following abbreviations
apply: °C (degrees Centigrade); g (gravitational field); vol (volume); DNA
(deoxyribonucleic acid); RNA (ribonucleic acid); kdal or kD (kilodaltons); OD (optical
density); EDTA (ethylene diamine tetra-acetic acid); E. coli (Escherichia coli); SDS
(sodium dodecyl sulfate); PAGE (polyacrylamide gel electrophoresis); ts (temperature
sensitive); p (plasmid); LB (Luria-Bertani medium: per liter: 10 g Bacto-tryptone, 5 g
yeast extract, 10 g NaCl, pH to 7.5 with NaOH); ml (milliliter); μl (microliter); M
(Molar); mM (millimolar); μM (microMolar); g (gram); μg (microgram); ng
(nanogram); U (units), mU (milliunits); min. (minutes); sec. (seconds); % (percent); bp
(base pair); kb (kilobase); PCR (polymerase chain reaction); Tris (tris(hydroxymethyl)-
aminomethane); PMSF (phenylmethylsulfonylfluoride); BSA (bovine serum albumin);
IPTG (isopropyl-β-D-thiogalactoside); ORF (open reading frame); ATCC (American
Type Culture Collection, Rockville, MD); Bio-Rad (Bio-Rad Corp., Hercules, CA);
Invitrogen (Invitrogen, Corp., San Diego, CA); New England Nuclear/Du Pont
(Boston, MA); Novagen (Novagen, Inc., Madison, WI); Pharmacia or Pharmacia
Biotech (Pharmacia Biotech, Piscataway, NJ); Pharmingen (PharMingen, San Diego,
CA); Gibco BRL (Gaithersburg, MD); and Stratagene (Stratagene Cloning Systems, La
Jolla, CA).

EXAMPLE 1 

Construction Of Univector Constructs 

In this example, illustrative Univector constructs are provided. Maps of
several Univectors are shown in Figure 23, showing pUNI-10, pUNI-20, and pUNI-30.
In this figure, nucleotide positions (in parentheses) of unique restriction enzyme
cleavage sites are shown. Functional sequences are shown as filled boxes and are
labeled inside of the circle. Boxes with arrows are genes transcribed in the direction
of the arrow. Below each map is the sequence of the polylinker region displayed as
coding triplets in frame with the open reading frame of loxP. Unique restriction
enzyme cleavage sites are in bold. General features of these Univectors include a loxP
site placed adjacent to the 5' end of a polylinker for insertion of cDNAs. loxP has a
single open reading frame that is in frame with the ATG of the NdeI and NcoI sites of
the polylinker. This facilitates the subsequent generation of protein fusions as noted
below. Following the polylinker are bacterial and eukaryotic transcriptional
terminators to facilitate 3' end formation of transcripts. The Univectors also comprise
a conditional origin of replication derived from R6Kγ that allows their propagation
only in bacterial hosts expressing the pir gene originally from R6Kγ [Metcalf et al.
(1994) Gene 138:1]. The Univectors also have the neo gene from Tn5 for selection in
bacteria (e.g., selection of recombinant products of UPS is achieved by selecting for
kanamycin resistance after transformation into a pir- strain because the neo gene on the
pUNI can only be propagated when covalently linked to an origin of replication that is
functional in a pir- background). pUNI-20 contains additional site specific
recombination sites, such as RS, that facilitate precise ORF transfer (POT), as
described below.

One Univector construct, the pUNI-10 vector, contains a loxP site, a kanamycin
resistance gene (Kn^R) and the R6Kγ conditional origin of replication (OriR R6Kγ). The
OriR R6Kγ is functional only in E. coli strains expressing the π replication protein (i.e.,
the product of the pir gene). A gene of interest is placed within pUNI-10 (either as a
result of constructing a library in pUNI-10 or by subcloning a previously cloned gene
of interest). Once the gene of interest is contained within pUNI-10, any number of
plasmid expression constructs containing this gene of interest can be constructed
rapidly (e.g., within a single day). The expression constructs will contain an antibiotic
resistance gene other than kanamycin (e.g., ampicillin). Using the site-specific
recombinase, Cre, a precise fusion between the pUNI vector and any other loxP
site-containing vector comprising the desired expression signals adjacent to the loxP site is
catalyzed. The site-specific recombination event which occurs between the single loxP
sites located on each plasmid (e.g., pUNI and the expression vector) results in the
stable fusion of these two plasmids in such a manner as to place the expression of the
gene of interest under the control of the expression signals contained within the
expression vector. This subcloning event occurs without the need to use restriction
enzymes. The fusion of pUNI-10 and the expression vector is selected for by selecting
for the ability of E. coli cells that do not express the π protein to grow in the presence
of kanamycin. pUNI cannot replicate in E. coli cells that do not express the π protein
unless pUNI has fused or integrated into another plasmid that contains a normal (i.e.,
not a conditional) origin of replication (e.g., the ColE1 origin). In this case, pUNI
will be replicated (as part of the fusion plasmid) and kanamycin resistance will be
conferred on the host cell.
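The selection scheme described above can be summarized in a small model. This is a sketch of the logic only: the class and function names are hypothetical, each plasmid is reduced to its origins of replication and resistance markers, and a host is described solely by whether it supplies the π protein.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Plasmid:
    name: str
    origins: frozenset   # e.g. {"R6Kgamma"} (needs the pi protein) or {"ColE1"}
    markers: frozenset   # antibiotic resistances carried, e.g. {"kanamycin"}

def fuse(a, b):
    """Cre-mediated fusion: the cointegrate carries both origins and both markers."""
    return Plasmid(f"{a.name}::{b.name}", a.origins | b.origins, a.markers | b.markers)

def replicates(plasmid, host_has_pi):
    """A plasmid replicates if at least one of its origins works in this host."""
    return any(origin != "R6Kgamma" or host_has_pi for origin in plasmid.origins)

def colony_forms(plasmids, host_has_pi, antibiotic):
    """True if some replicating plasmid in the cell confers the selected resistance."""
    return any(replicates(p, host_has_pi) and antibiotic in p.markers for p in plasmids)

PUNI = Plasmid("pUNI-10", frozenset({"R6Kgamma"}), frozenset({"kanamycin"}))
PHOST = Plasmid("pHOST", frozenset({"ColE1"}), frozenset({"ampicillin"}))
```

In this model, neither parental plasmid alone allows growth on kanamycin in a host lacking the π protein, whereas `fuse(PUNI, PHOST)` does, which is the basis of the selection for fusion events.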

a) Generation of pUNI-10 

Figure 2A provides a schematic map of the pUNI-10 vector; the locations of
selected restriction enzyme sites are indicated (with the exception of NotI, all sites
shown are unique). Figure 2B shows the DNA sequence of the loxP site and the
polylinkers contained within pUNI-10 (i.e., nucleotides 401-530 of SEQ ID NO:1).

Nucleotides 1-400 of pUNI-10 contain the conditional origin of replication
from R6Kγ (OriR R6Kγ); the OriR R6Kγ was derived from the plasmid R6K (ATCC
37120) [Metcalf et al. (1996) Plasmid 35:1]; nucleotides 401-414 comprise a NotI-KpnI
polylinker that facilitates the exchange of lox sites; pUNI-10 contains a wild-type
loxP site (as discussed above, pUNI vectors containing modified lox sites may be
employed). Nucleotides 415-448 comprise the wild-type loxP site; nucleotides 449-527
comprise a polylinker used for the insertion of the gene of interest (genomic or
cDNA sequences). Nucleotides 528-750 contain the polyA addition sequence from
bovine growth hormone (BGH) (the BGH polyA sequence is available on a number of
commercially available vectors including pcDNA3.1 (Invitrogen)); the BGH polyA
sequence provides a 3' end for transcripts expressed in mammalian and other
eukaryotic cells. The art is aware of other eukaryotic polyA sequences that may be
used in place of the BGH polyA sequence (e.g., the SV40 polyA sequence, the TK
polyA sequence, etc.). Nucleotides 751-890 contain the T7 terminator sequence which
is used to terminate transcription in prokaryotic hosts (numerous prokaryotic
termination signals are known to the art and may be employed in place of the T7
terminator sequence). Nucleotides 890-895 comprise an EcoRV restriction enzyme
recognition site and nucleotides 896-2220 comprise the kanamycin resistance gene
(Kan or KnR) from Tn5 which provides a positive selectable marker. The KnR gene
found on pUNI-10 was modified using site-directed mutagenesis to remove the
naturally occurring NcoI site such that pUNI-10 contains a unique NcoI site in the
polylinker region located at nucleotides 449-527. pUNI vectors need not contain a
KnR gene (modified or wild-type); other selectable genes may be used in place of the
KnR gene (e.g., ampicillin resistance gene, tetracycline resistance gene, zeocin™
resistance gene, etc.). The pUNI vector need not contain a selectable marker, although
the use of a selectable marker is preferred. When a selectable marker is present on the
pUNI vector, this marker is preferably a different selectable marker than that present
on the pHOST vector. The nucleotide sequence of pUNI-10 is provided in SEQ ID
NO:1.
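
The coordinate map described above can be captured in a small lookup table. The following Python sketch is illustrative only: the feature labels and the helper names features_at and feature_length are hypothetical, and the coordinates are copied from the description above. Because the description assigns nucleotide 890 both to the T7 terminator and to the EcoRV site, the lookup returns every feature that covers a position rather than a single one.

```python
# Feature map of pUNI-10, transcribed from the coordinates given in the text.
# The labels are informal names chosen for this sketch only.
PUNI10_FEATURES = [
    (1, 400, "R6K gamma conditional origin"),
    (401, 414, "NotI-KpnI polylinker"),
    (415, 448, "wild-type loxP site"),
    (449, 527, "polylinker for gene of interest"),
    (528, 750, "BGH polyA sequence"),
    (751, 890, "T7 terminator"),
    (890, 895, "EcoRV site"),
    (896, 2220, "kanamycin resistance gene"),
]


def features_at(position):
    """Return the names of all features covering a 1-based nucleotide position."""
    return [name for start, end, name in PUNI10_FEATURES if start <= position <= end]


def feature_length(name):
    """Return the length in nucleotides of the named feature."""
    for start, end, label in PUNI10_FEATURES:
        if label == name:
            return end - start + 1
    raise KeyError(name)


print(features_at(430))
print(feature_length("wild-type loxP site"))
```

The computed length of the loxP interval (nucleotides 415-448) is 34 nucleotides, which agrees with the known size of a loxP site.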
EXAMPLE 2

Construction Of Host Plasmids For Use In The Univector Plasmid-Fusion System 

Host plasmids used in the Univector plasmid fusion system are referred to as
pHOST plasmids. pHOST plasmids or vectors are generally expression vectors that
have been modified by the insertion of a site-specific recombination site, such as a lox
site. The presence of the lox site on the pHOST plasmid permits the rapid subcloning
or insertion of the gene of interest contained within a pUNI vector to generate an
expression vector capable of expressing the gene of interest. The pHOST vector may
encode a protein domain such as an affinity domain including, but not limited to,
glutathione-S-transferase (Gst), maltose binding protein (MBP), a portion of
staphylococcal protein A (SPA), a polyhistidine tract, etc. A variety of commercially
available expression vectors encoding such affinity domains are known to the art.
When the pHOST plasmid contains a vector-encoded affinity domain, a fusion protein
comprising the vector-encoded affinity domain and the protein of interest is generated
when the pUNI and pHOST vectors are recombined.

In some embodiments of the present invention, the host vector features include
the ColE1 origin of replication and the bla gene for propagation and selection in
bacteria, a loxP site for plasmid fusions and a specific promoter residing upstream of,
and adjacent to, the loxP site. Host vectors may also comprise sequences responsible
for propagation, selection, and maintenance in organisms other than E. coli.

To generate expression vectors intended to generate transcriptional fusions (i.e.,
pHOST does not contain a vector-encoded protein domain), a lox site is placed after
(i.e., downstream of) the start of transcription in the host vector. This is easily
accomplished using synthetic oligonucleotides comprising the desired lox site. In
designing the oligonucleotide comprising the lox site, care is taken to avoid
introducing an ATG or start codon that might initiate translation inappropriately.

To generate expression vectors intended to generate a fusion protein between a
vector-encoded protein domain and the protein of interest (encoded by the gene of
interest contained within the pUNI vector), care is taken to place the lox site in the
correct reading frame such that 1) an open reading frame is maintained through the lox
site on pHOST and 2) the open reading frame in the lox site on pHOST is in frame
with the open reading frame found on the lox site contained within the pUNI vector.
In addition, the oligonucleotide comprising the lox site on pHOST is designed to avoid
the introduction of in-frame stop codons. The gene of interest contained within the
pUNI vector is cloned in a particular reading frame so as to facilitate the creation of
the desired fusion protein.
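
The requirement that the lox linker introduce no in-frame stop codons can be checked mechanically. The sketch below is offered as an illustration rather than as part of the disclosed method: it scans the three forward reading frames of a linker strand for stop codons. The function name stop_free_frames is hypothetical; the test sequence is the oligonucleotide of SEQ ID NO:4 used for the pGEX-2TKcs linker in section a) below. Which frame is actually read depends on the upstream coding sequence of the host vector, which this sketch does not model.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def stop_free_frames(seq):
    """Return the forward frame offsets (0, 1, 2) that contain no stop codon."""
    seq = seq.upper()
    open_frames = []
    for offset in range(3):
        # Split the strand into complete codons starting at this offset.
        codons = [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]
        if not any(codon in STOP_CODONS for codon in codons):
            open_frames.append(offset)
    return open_frames


# Top-strand oligonucleotide of the pGEX-2TKcs loxP linker (SEQ ID NO:4).
SEQ_ID_NO_4 = "CATGGCTATAACTTCGTATAGCATACATTATACGAAGTTATG"
print(stop_free_frames(SEQ_ID_NO_4))
```

For SEQ ID NO:4, only the frame at offset 1, which begins with the ATG of the NcoI-compatible end, is free of stop codons; the other two frames are closed by TAG and TAA codons within the lox sequence.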

The modification of several expression vectors is provided below to illustrate
the creation of suitable pHOST vectors. In each case, the general strategy involved the
generation of a linker containing a lox site by annealing two complementary
oligonucleotides. The annealed oligonucleotides form a linker having sticky ends that
are compatible with ends generated by restriction enzymes whose sites are
conveniently located in the parental expression vector (e.g., within the polylinker of
the parental expression vector).



a) Modification of the pGEX-2TKcs Prokaryotic Expression Vector 

pGEX-2TKcs is an expression vector active in E. coli cells which is designed
for inducible, intracellular expression of genes or gene fragments as fusions with Gst.
pGEX-2TKcs contains the IPTG-inducible tac promoter (Ptac) and was derived from
pGEX-2TK (Pharmacia Biotech) as follows. The polylinker sequence of pGEX-2TK,
5'-GGATCCCCGGGAATTC-3' (SEQ ID NO:2), was replaced with the following
sequence: 5'-GGATCGCATATGCCCATGGCTCGAGGATCCGAATTC-3' (SEQ ID
NO:3) to generate the pGEX-2TKcs vector.

A linker containing a loxP site was generated by annealing the following
oligonucleotides: 5'-CATGGCTATAACTTCGTATAGCATACATTATACGAAGTTATG-3'
(SEQ ID NO:4) and 5'-GATCCATAACTTCGTATAATGTATGCTATACGAAGTTATAGC-3'
(SEQ ID NO:5). When annealed, these two oligonucleotides form a double-stranded
linker having a 5' end compatible with an NcoI sticky end and a 3' end compatible
with a BamHI sticky end (Figure 3A). pGEX-2TKcs was digested with NcoI and
BamHI (Figure 3B) and the annealed loxP linker was inserted to form pGst-lox.
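
The overhangs produced by annealing the two oligonucleotides can be verified computationally. The following sketch is an illustration under stated assumptions, not part of the disclosed method: it assumes that each strand carries a 5' single-stranded extension and that the remainder pairs perfectly. The function names and the min_duplex parameter are hypothetical; the sequences are those of SEQ ID NOS:4 and 5.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA strand written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def anneal(top, bottom, min_duplex=10):
    """Return (top 5' overhang, duplex on the top strand, bottom 5' overhang).

    Both strands are written 5' to 3'. The longest perfect duplex in which
    each strand keeps a 5' extension is reported. ValueError is raised if no
    duplex of at least min_duplex base pairs exists in that geometry.
    """
    bottom_rc = reverse_complement(bottom)
    for k in range(len(top) - min_duplex + 1):
        duplex = top[k:]
        if bottom_rc.startswith(duplex):
            unpaired = len(bottom) - len(duplex)
            return top[:k], duplex, bottom[:unpaired]
    raise ValueError("strands do not anneal with 5' overhangs")


SEQ_ID_NO_4 = "CATGGCTATAACTTCGTATAGCATACATTATACGAAGTTATG"
SEQ_ID_NO_5 = "GATCCATAACTTCGTATAATGTATGCTATACGAAGTTATAGC"

top_overhang, duplex, bottom_overhang = anneal(SEQ_ID_NO_4, SEQ_ID_NO_5)
print(top_overhang, len(duplex), bottom_overhang)
```

The reported overhangs, CATG and GATC, are the 5' extensions left by NcoI and BamHI digestion, respectively, which is consistent with the compatibility stated above.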

b) Modification of the pVL1392 Baculovirus Expression Vector 

pVL1392 is an expression vector that contains the polyhedrin promoter which
is active in insect cells (Pharmingen). A linker containing a loxP site was generated
by annealing the following oligonucleotides:
5'-GGCCGGACGTCATAACTTCGTATAGCATACATTATACGAAGTTATG-3' (SEQ ID NO:6) and
5'-GATCCATAACTTCGTATAATGTATGCTATACGAAGTTATGACGTCC-3' (SEQ ID NO:7). When
annealed, these two oligonucleotides form a double-stranded linker having a 5' end
compatible with a NotI sticky end and a 3' end compatible with a BamHI sticky end
(Figure 4A). pVL1392 was digested with NotI and BamHI (Figure 4B) and the
annealed loxP linker was inserted to form pVL1392-lox.

c) Modification of the pGAP24 Yeast Expression Vector 

pGAP24 is an expression vector that is based on the yeast 2 µm circle and
contains the constitutive GAP (glyceraldehyde 3-phosphate dehydrogenase) promoter
(PGAP) which is active in yeast cells and the TRP1 gene (used as a selectable marker
when the cells are grown in medium lacking tryptophan) [the GAP promoter is available
on pAB23; Schilds (1990) Proc. Natl. Acad. Sci. USA 87:2916]. A linker containing a
loxP site was generated by annealing the following oligonucleotides:
5'-TCGAGACGTCATAACTTCGTATAGCATACATTATACGAAGTTATGC-3' (SEQ ID NO:8)
and 5'-GGCCGCATAACTTCGTATAATGTATGCTATACGAAGTTATGACGTC-3'
(SEQ ID NO:9). When annealed, these two oligonucleotides form a double-stranded
linker having a 5' end compatible with an XhoI sticky end and a 3' end compatible with
a NotI sticky end (Figure 5A). pGAP24 was digested with XhoI and NotI (Figure 5B)
and the annealed loxP linker was inserted to form pGAP24-lox.

d) Modification of the pGAL14 Yeast Expression Vector

pGAL14 is a yeast centromeric expression vector that contains the GAL
promoter (PGAL), which is induced by the presence of galactose in the medium, and the
TRP1 gene. A linker containing a loxP site was generated by annealing together the
oligonucleotides listed in SEQ ID NOS:8 and 9. When annealed, these two
oligonucleotides form a double-stranded linker having a 5' end compatible with an XhoI
sticky end and a 3' end compatible with a NotI sticky end (Figure 6A). pGAL14 was
digested with XhoI and NotI (Figure 6B) and the annealed loxP linker was inserted to
form pGAL14-lox.

EXAMPLE 3

Expression And Purification Of A Gst-Cre Fusion Protein 

In order to provide a source of purified Cre recombinase for the in vitro
recombination of plasmids, the cre gene was inserted into a Gst expression vector such
that a fusion protein comprising Gst at the amino-terminal end and Cre recombinase at
the carboxy-terminal end was produced. The Gst-Cre fusion protein was purified by
chromatography using Glutathione Sepharose 4B (Pharmacia). Purified Gst-Cre can be
stored at -80°C, -20°C, or 4°C for several months without significant loss of activity.

To simplify Cre purification, pQL123, a plasmid expressing a Gst-Cre fusion protein,
was constructed. The cre gene was isolated by polymerase chain reaction
(PCR) amplification using the plasmid pBS39 (U.S. Patent 4,959,317). U.S. Patent
Nos. 4,683,195, 4,683,202 and 4,965,188 describe PCR methodology and are
incorporated herein by reference. The primers used in the PCR were designed to
introduce an NcoI site at the first ATG in the cre open reading frame. The PCR
product was cloned into a TA cloning vector (pCRII.l; Invitrogen) and then was
subcloned as an NcoI-EcoRI fragment into pGEX-2TKcs (Example 2) to generate
pQL123. The ligation products were used to transform DH5α cells and the desired
recombinant was isolated and used to transform BL21(DE3) cells (Invitrogen).

The nucleotide sequence of the Gst-Cre coding region within pQL123 is listed
in SEQ ID NO:10 (Figure 26B). The amino acid sequence of the fusion protein
expressed by pQL123 is listed in SEQ ID NO:11 (Figure 26C).

To express the Gst-Cre fusion protein, BL21(DE3) cells containing the pQL123
plasmid were grown at 37°C in LB containing 100 µg/ml ampicillin until the OD600
reached 0.6. Expression of the fusion protein was then induced by the addition of
IPTG to a final concentration of 0.4 mM and the cells were allowed to grow overnight
at 25°C. Following induction, the bacterial cells were pelleted by centrifugation at
5,000 x g at 4°C and the supernatant was discarded. A cell lysate was prepared as
follows. Cells harvested from 0.5 liter of culture were suspended in 35 ml of a
solution containing 20 mM Tris-HCl, pH 8.0, 0.1 M NaCl, 1 mM EDTA, 0.5%
Nonidet P-40, 5 µg/ml each of leupeptin, antipain and aprotinin, and 1 mM PMSF at
4°C. The cells were incubated for 10 min on ice and then disrupted by sonication (3 x
15 sec bursts) using a sonicator (Ultrasonic Heat Systems Model 200R) at full power.
The lysate was then clarified by centrifugation at 12,000 rpm using an SS34 rotor
(Sorvall).

The Gst-Cre fusion protein was affinity purified from the cell lysate by
chromatography on Glutathione Sepharose 4B (Pharmacia) according to the
manufacturer's instructions. The protein concentration of Gst-Cre was determined by
Bradford analysis (BioRad).

Aliquots of the cell lysate before and after chromatography on Glutathione
Sepharose 4B were applied to an SDS-PAGE gel. Following electrophoresis, the gel
was stained with Coomassie blue. The stained gel is shown in Figure 7. In Figure 7,
lanes 1 and 2 contain the cell lysate before and after chromatography, respectively.
The arrowhead indicates the Gst-Cre fusion protein. The migration of the molecular
weight protein markers is indicated to the left of lane 1. The results shown in Figure
7 demonstrate the purification of the Gst-Cre fusion protein. This fusion protein was
shown to be functional (i.e., capable of mediating recombination between lox sites) in
the in vitro recombination assay described below.

Gst-Cre retained high recombinase activity as measured by UPS. The
efficiency of this reaction reached up to 16.8% as shown in Figure 15, similar to that
for native Cre (Abremski et al., supra). In this figure, the indicated amounts of
Gst-Cre were incubated with pUNI-10 and pQL103 plasmid DNA as described below.
The percentage of recombinants was calculated as the ratio of total kanamycin
resistant transformants (fusion events between pUNI-10 and pQL103) to total
ampicillin resistant transformants (pQL103 alone and pUNI-10-pQL103 fusions). The
efficiency of Gst-Cre was examined in a second reaction producing a tagged
recombinant protein as diagrammed in Figure 24, fusing a Gst tag to Skp1.
Recombinant plasmids isolated from KnR transformants were shown by restriction
analysis to be correct fusion products between the Univector and the host vector via
the loxP sites. In this case, 10 of 12 KnR transformants were the correct heterodimer
(Figure 9) and 2 were trimers (Figure 9, lanes 8 and 10) with two copies of pUNI
fused to a host vector. It should be noted that trimeric plasmids also have a correct
fusion junction that places the gene of interest adjacent to the desired regulatory
sequences and are fully functional for most needs. However, the isolation of trimeric
plasmids can be nearly eliminated if gel purified monomeric supercoiled host DNA is
used. This method is highly efficient and typically requires only one or two minipreps
to identify the desired construct.
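
The statement that one or two minipreps usually suffice follows from the observed frequency of correct clones. As a rough worked example, assuming each picked colony is an independent draw with the 10-of-12 frequency reported above (an assumption made for this sketch only; the function name is hypothetical), the chance of recovering at least one heterodimer from n colonies is 1 - (1 - p)^n:

```python
def prob_at_least_one(p_correct, n_picked):
    """Probability that at least one of n independently picked colonies is correct."""
    return 1.0 - (1.0 - p_correct) ** n_picked


# Fraction of heterodimers among the KnR transformants analyzed (10 of 12).
p = 10 / 12

for n in (1, 2, 3):
    print(n, round(prob_at_least_one(p, n), 3))
```

Under this assumption a single miniprep succeeds about 83% of the time and two minipreps about 97% of the time.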

EXAMPLE 4

In Vitro Recombination Using The Univector Plasmid Fusion System 

The Univector Plasmid Fusion System permits the in vitro recombination of
two plasmids. Figure 8 provides a schematic showing the strategy employed for in
vitro recombination. pA represents a generic pUNI vector that contains a loxP site, a
kanamycin resistance gene and the conditional R6K origin that is only functional in
E. coli strains expressing the π protein (e.g., E. coli strains BW18815, BW19094,
BW20978, BW20979, BW21037, BW21038). pB represents a generic pHOST vector
that contains a loxP site, an ampicillin resistance gene and a ColE1 origin of
replication. pAB represents the fused plasmid which results from the Cre-mediated
fusion of pA and pB.
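
The selection logic of Figure 8 can be summarized as a small truth table. The sketch below is a simplified model written for illustration; the class and function names are hypothetical. It assumes that a plasmid is maintained whenever at least one of its origins is functional in the host, and that a cell forms a colony only if a maintained plasmid carries the resistance gene matching the antibiotic in the plate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plasmid:
    name: str
    origins: frozenset  # replication origins carried, e.g. {"R6K"} or {"ColE1"}
    markers: frozenset  # antibiotics to which the plasmid confers resistance


def fuse(a, b, name):
    """Model a Cre-mediated fusion: the cointegrate carries everything from both parents."""
    return Plasmid(name, a.origins | b.origins, a.markers | b.markers)


def replicates(plasmid, host_has_pi):
    """ColE1 functions in any E. coli host; the R6K conditional origin needs the pi protein."""
    return "ColE1" in plasmid.origins or ("R6K" in plasmid.origins and host_has_pi)


def forms_colony(plasmid, host_has_pi, antibiotic):
    """A colony forms only if the plasmid replicates and confers the needed resistance."""
    return replicates(plasmid, host_has_pi) and antibiotic in plasmid.markers


pA = Plasmid("pA", frozenset({"R6K"}), frozenset({"kan"}))
pB = Plasmid("pB", frozenset({"ColE1"}), frozenset({"amp"}))
pAB = fuse(pA, pB, "pAB")

# Behaviour in a host that does not express the pi protein.
for plasmid in (pA, pB, pAB):
    print(plasmid.name,
          forms_colony(plasmid, False, "kan"),
          forms_colony(plasmid, False, "amp"))
```

In this model only pAB yields kanamycin-resistant colonies in a host lacking the π protein, which is the basis for selecting fusion events on kanamycin plates.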

To illustrate the in vitro recombination reaction, pUNI-5 (a pUNI vector which
differs from pUNI-10 only in that pUNI-5 retains the NcoI site in the KnR gene and
contains a different polylinker) was employed as pA and pQL103, an
ampicillin-resistant plasmid containing a loxP site and the ColE1 origin, was employed
as pB. In a total reaction volume of 20 µl, 0.2 µg each of pUNI-5 (pA) and pQL103 (pB)
were mixed in a buffer containing 50 mM Tris-HCl (pH 7.5), 10 mM MgCl2, 30 mM NaCl
and 1 mg/ml BSA. The amount of purified Gst-Cre (Example 3) was varied from 0 to
1.0 µg. The reactions were incubated at 37°C for 20 minutes and then the reactions
were placed at 70°C for 5 min to inactivate the Gst-Cre protein. Five microliters of
each reaction mixture were used directly to transform competent DH5α cells (CaCl2
treated). The transformed cells were plated onto LB/Amp (100 µg/ml amp) and
LB/Kan (40 µg/ml kan) plates and the numbers of ampicillin-resistant (ApR) and
kanamycin-resistant (KnR) colonies were counted. The results are summarized in
Table 1.



TABLE 1

Gst-Cre (µg/reaction)    ApR Colonies    KnR Colonies    % of Total KnR/ApR
0                        2.6 x 10^4      0               0
0.01                     1.9 x 10^4      571             3
0.05                     1.1 x 10^4      682             6.2
0.1                      1.5 x 10^4      502             3.3
0.5                      0.3 x 10^4      104             3.4
1.0                      0.3 x 10^4      52              1.7


The results shown in Table 1 demonstrate that, under these reaction conditions,
0.05 µg of purified Gst-Cre per 20 µl reaction yields the most efficient plasmid
fusion. Plasmid DNA was isolated from individual kanamycin-resistant colonies
(using standard mini-prep plasmid DNA isolation protocols) and subjected to
restriction enzyme digestion to determine the structure of the fused plasmids. This
analysis revealed that plasmid DNA isolated from the kanamycin-resistant colonies
represented a dimer created by the desired fusion of pUNI-5 and pQL103 via the loxP
sites. These results demonstrate that the Univector Plasmid Fusion System can be used
to rapidly fuse two plasmids together in vitro.
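
The percentages in Table 1 are the ratio of KnR to ApR colonies at each enzyme amount. A short recomputation is sketched below; the variable and function names are illustrative, and the colony counts are copied from Table 1. Because the ApR counts are reported to only two significant figures, the recomputed values may differ from the tabulated ones in the last digit.

```python
# (micrograms Gst-Cre per reaction, ApR colonies, KnR colonies), copied from Table 1.
TABLE_1 = [
    (0.0, 2.6e4, 0),
    (0.01, 1.9e4, 571),
    (0.05, 1.1e4, 682),
    (0.1, 1.5e4, 502),
    (0.5, 0.3e4, 104),
    (1.0, 0.3e4, 52),
]


def percent_fused(ap_colonies, kn_colonies):
    """Kanamycin-resistant colonies as a percentage of ampicillin-resistant colonies."""
    return 100.0 * kn_colonies / ap_colonies


results = {cre: percent_fused(ap, kn) for cre, ap, kn in TABLE_1}
best = max(results, key=results.get)

for cre, pct in results.items():
    print(f"{cre:>5} ug Gst-Cre: {pct:.1f}% KnR/ApR")
print("most efficient amount (ug):", best)
```

The recomputation identifies 0.05 µg of Gst-Cre as the amount giving the highest ratio, in agreement with the conclusion drawn from Table 1.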

EXAMPLE 5 

In Vitro Fusion Between pUNI Vectors Containing
Genes Of Interest And Lox-Containing Expression Vectors
Produces Fused Vectors Capable Of Expressing The Gene Of Interest

In Example 4 it was demonstrated that the Univector Plasmid Fusion System
can be used to rapidly fuse two plasmid constructs together in vitro. This example
demonstrates the ability of the Univector Plasmid Fusion System to fuse two plasmids
together in a manner that places the gene of interest contained on the pUNI vector
under the transcriptional control of a promoter contained on the pHOST or expression
vector, such that a functional protein of interest is expressed from the fused
construct. A series of expression plasmids were made by UPS and tested for
expression in several contexts.

a) Insertion Of A Gene Of Interest Into The pUNI-10 Vector

The cDNA encoding the wild-type yeast Skp1 protein [Bai et al. (1996) Cell
86:263] was cloned into the pUNI-10 vector between the NdeI and BamHI sites to
generate pUNI-Skp1; the yeast SKP1 cDNA sequence is available as GenBank
Accession No. U61764. Skp1 is an essential protein involved in the regulation of the
cell cycle in yeast. Yeast cells containing a temperature sensitive mutant of Skp1
cannot grow at the non-permissive temperature (37°C).

b) In Vitro Fusion Reactions And Complementation Assays

pUNI-Skp1 was recombined with pGAP24-lox (Example 2) and pGAL14-lox
(Example 2) using the in vitro reaction described in Example 4; 0.2 µg of Gst-Cre was
used per 20 µl reaction. The resulting plasmid fusions were termed pGAP24-Skp1 and
pGAL14-Skp1. pGAP24-Skp1 and pGAL14-Skp1 were then transformed into the
temperature sensitive (ts) skp1-11 mutant yeast strain Y555 (Bai et al., supra) and the
transformed yeast cells were plated onto SC-tryptophan plates (to select for the
expression of the selectable marker TRP1) and incubated at either a permissive (25°C)
or non-permissive temperature (37°C). The plates which received yeast cells
transformed with pGAL14-Skp1 contained galactose. The ability of the transformed
cells to grow at the non-permissive temperature is dependent upon the expression of
the wild-type SKP1 gene encoded by a properly fused pUNI-Skp1/expression vector
construct. As a control, the yeast SKP1 genomic clone contained in a URA3 CEN
vector (produced by conventional cloning techniques) was used to transform the ts
skp1-11 mutant yeast strain Y555 and the transformed cells were also plated at 25°C
and 37°C. In each case, an expression vector (e.g., pRS414 or pRS415; Bai et al.,
supra) lacking the SKP1 gene but containing the same selectable marker (i.e., TRP1)
as either pGAP24-Skp1, pGAL14-Skp1 or URA3 CEN-SKP1 was used to transform
Y555 cells as a control capable of permitting the growth of transformed Y555 cells on
selective medium at the permissive temperature.

The results demonstrated that the URA3 CEN-SKP1 construct produced by
conventional cloning techniques produced a functional Skp1 protein which was capable
of complementing the lethality of the skp1-11 ts mutation. More importantly, the
results demonstrated that the in vitro fusion reaction that created pGAP24-Skp1 and
pGAL14-Skp1 produced constructs capable of producing functional Skp1; that is,
Y555 cells transformed with either pGAP24-Skp1 or pGAL14-Skp1 were capable of
growth at 37°C, a temperature at which the ts Skp1-11 protein produced by the host
strain is non-functional. Expression vectors lacking the SKP1 cDNA were incapable
of complementing the lethality of the skp1-11 ts mutation.

c) Restriction Analysis, SDS-PAGE Analysis and
Western Blot Analysis of In Vitro Fusion Reactions

pUNI-Skp1 was recombined with pGst-lox (Example 2) using the in vitro
reaction described in Example 4; 0.2 µg of Gst-Cre was used per 20 µl reaction. The
resulting plasmid fusion was termed pGST-Skp1. Figure 9A provides a schematic
showing the starting constructs and the predicted fusion construct. Five microliters of
the fusion reaction mixture was used to transform DH5α cells as described in Example 4.
The transformed cells were plated onto LB/Amp/Kan plates and plasmid DNA was
isolated from individual ApR KnR colonies. The plasmid DNAs were digested with PstI
followed by electrophoresis on agarose gels to examine the structure of the fused
plasmids. A representative ethidium bromide-stained gel is shown in Figure 9B. In
Figure 9B, lane "M" contains DNA size markers, lanes pUNI-Skp1 and pGst-lox
contain the starting plasmids digested with PstI and lanes 1-12 contain plasmid DNA
from individual ApR KnR colonies digested with PstI. Lanes marked with an "*"
indicate that these colonies contained a trimeric fusion plasmid that resulted from the
fusion of two pGst-lox plasmids and one pUNI-Skp1 plasmid. The sizes of the two PstI
fragments which result from the fusion of pUNI-Skp1 and pGst-lox are indicated
(5.8 and 2.0 kb). The results shown in Figure 9B demonstrate that the in vitro fusion
reaction resulted in the production of the desired fused construct with high efficiency
(about 83% of the plasmids in the ApR KnR colonies comprised the fusion of one
pUNI-Skp1 vector with one pGst-lox vector).

Three individual ApR KnR colonies were picked and grown in liquid cultures
which were induced with IPTG to examine whether the fused construct (pGst-Skp1)
could produce the desired Gst-Skp1 fusion protein. The cultures were grown, induced
and cell extracts were prepared as described in Example 6. Aliquots of the cell
lysates prepared from induced and uninduced cells were electrophoresed on an
SDS-PAGE gel and the gel was either stained with Coomassie blue or transferred to
nitrocellulose to generate a Western blot. The Western blot was probed using an
anti-Skp1 polyclonal antibody (the antibody was raised against the yeast Skp1 using
conventional methods). The resulting Coomassie-stained gel and Western blot are
shown in Figures 10A and 10B, respectively.

In Figure 10A, lane "M" contains protein molecular weight markers (size in kd
is indicated). Lanes marked "C" contain extracts prepared from E. coli containing a
GST-SKP1 construct made by conventional cloning (i.e., the SKP1 cDNA was excised
using restriction enzymes and inserted into pGEX-2TKcs (Example 2)). Lanes 1-3
contain extracts from ApR KnR cells transformed with in vitro fusion reaction mixtures.
Extracts prepared from uninduced cells and IPTG induced cells are indicated by "-"
and "+", respectively. The arrowheads indicate the location of the Gst-Skp1 fusion
proteins. The Gst-Skp1 fusion product generated from the pGST-SKP1 fusion
construct contains 15 additional amino acids which are located between the Gst domain
and the Skp1 protein sequences relative to the Gst-Skp1 fusion protein expressed from
the conventionally constructed GST-SKP1 plasmid (the additional 15 amino acids are
encoded by the linker comprising the loxP site; see Figure 3). In Figure 10B, the lane
designations are the same as described for Figure 10A. This Western blot confirms
that the bands indicated by the arrowheads in Figure 10A represent Gst-Skp1 fusion
proteins.

The results shown in Figures 10A and 10B demonstrate that the Univector
Fusion System can be used to create an expression vector that maintains the proper
translational reading frame and permits the expression of a fusion protein comprising
the expression vector-encoded affinity tag and the protein of interest.

The above results demonstrate that the Univector Fusion System can be used to
recombine two plasmids, one containing a gene of interest but no promoter (this vector
may optionally contain expression signals such as termination signals and/or
polyadenylation signals) and the other containing a promoter and optionally other
expression signals (e.g., splicing signals, translation initiation codons) (and optionally
sequences encoding an affinity domain) but lacking a gene of interest, in vitro in such
a manner that the proper translational reading frame is maintained, permitting the
expression of a functional protein from the fused plasmids in the host cell.

d) Additional Examples 

The S. cerevisiae SKP1 ORF (Bai et al., supra) in pUNI-10 was fused to the
pGST-lox host vector pHB2-GST by UPS to create a bacterial Gst-lox-Skp1 fusion
protein expressed under the control of the E. coli tac promoter. A similar Gst-Skp1
expression plasmid lacking loxP (i.e., pCB149), made by conventional cloning, was
used as a control. Approximately equal amounts of the two fusion proteins were
expressed as shown in Figures 16A and 16B, indicating that the presence of loxP did not
significantly affect either the transcription or translation of the fusion protein. In this
figure, proteins were separated by SDS-PAGE and stained with Coomassie blue
(Figure 16A) or immunoblotted (Figure 16B) with anti-Skp1 antibodies. Protein from
a control GST-Skp1 expression plasmid lacking loxP (lanes 1 and 2) and three
independent transformants of UPS-derived Gst-lox-Skp1 expression constructs (lanes
3-8) are shown. The asterisk denotes a degradation product.

In another example, to measure the effect of the loxP sequence upon eukaryotic
expression in the context of transcriptional fusions, the SKP1 ORF was placed under
the control of the S. cerevisiae GAL1 promoter both by conventional means and by
UPS. In this case, it was observed that the relative expression level of the
UPS-derived plasmid was slightly lower. This reduction in expression might be explained
by the ability of loxP RNA to form a 13 bp stem-loop, as secondary structures formed
within the 5' UTR of an mRNA can interfere with the initiation of translation [Kozak
(1989) Mol. Cell. Biol. 9:5134], although an understanding of the mechanism is not
required to practice the present invention, and the present invention is not limited to
any particular mechanistic explanation. To test this hypothesis, a series of lox sites
were made containing mutations designed to reduce the stability of the stem-loop, as
described in Example 8.

In yet other examples, multiple genes have been tested using UPS and
expressed in several different organisms. In addition to Gst-Skp1 expression in
bacteria, Myc-Rnr4 and Myc-Rad53 have been expressed in S. cerevisiae as shown in
Figure 17, showing a comparison of expression levels between loxP and loxH
containing constructs. Protein extracts were prepared from Y80 cells grown in SC-ura
plus galactose containing the following plasmids: vector alone (lane 1), pMH176
(GAL-MYC3-RNR4) made by conventional cloning lacking a lox sequence (lane 2),
UPS-derived GAL-lox-MYC3-RNR4 constructs with either loxP (lane 3) or loxH (lane
4) present between the GAL1 promoter and the MYC3-RNR4 gene, vector alone (lane
5), and UPS-derived GAL1-MYC3-lox-RAD53 construct (lane 6). The recipient vector
for RAD53 was pHY314-MYC3.

Furthermore, many baculovirus expression constructs have been made by UPS
and tested. Shown in Figure 18, as illustrative examples, are Gst-Rad53, Myc-Rad53,
and HA-Rad53. For Rad53, the UPS-derived constructs express at the same level as
Gst-Rad53 made by conventional methods (Figure 18, compare lanes 1 and 2). Figure
18 shows the expression of the UPS-derived baculovirus expression constructs in insect
cells. UPS reactions were performed between pUNI-10-RAD53 clones and
baculovirus expression vectors in pVL1392 backbones engineered to contain lox sites
and epitope tags. Host insect expression vectors used were pHI100-GST,
pHI100-MYC3, and pHI100-HA3 and the resulting fusion plasmids were crossed onto
Baculogold (Pharmingen) by standard methods. GST affinity-purified protein from
lysates of 1 million cells infected with baculovirus expressing GST-RAD53 made either
by conventional cloning (lane 1) or by UPS (lane 2) was fractionated by SDS-PAGE
and stained with Coomassie blue. Western blots of protein prepared from cells infected
with the baculoviruses containing vector alone (lane 3), UPS-derived
MYC3-lox-RAD53 (lane 4), vector alone (lane 5), or UPS-derived HA3-lox-RAD53
(lane 6) were probed with anti-Myc (lanes 3-4) or anti-HA (lanes 5-6) monoclonal
antibodies.

In yet other examples, in mammalian cells, expression of a Myc-tagged F-box
protein under the control of the CMV promoter was demonstrated following
transfection into HeLa cells, as shown in Figure 19. This figure shows
immunoblotting of whole cell lysates with anti-HA antibodies. The cells used were
HeLa cells transfected by the calcium phosphate method with the CMV expression
vectors pHM200-HA3 or pHM200-HA3-F3, expressing an HA-tagged F-box protein.
In all, over 200 UPS-derived constructs have been made and tested, showing
expression success rates indistinguishable from those of conventional cloning methods.

EXAMPLE 6 

Construction Of An E. coli Strain That Inducibly Expresses Cre Recombinase 

An E. coli strain containing a cre gene under the control of an inducible
promoter, termed the QLB4 strain, was constructed as follows. The cre gene was
placed under the transcriptional control of the inducible lac promoter by inserting the
cre ORF into a derivative of pNN402 [Elledge et al. (1991) Proc. Natl. Acad. Sci.
USA 88:1731]; pNN402 was modified to contain a lac promoter. This construct was
then crossed onto lambda phage (e.g., λgt11) using conventional techniques. The
recombinant lambda phage carrying the lac-cre gene was integrated into the
chromosome of E. coli strain JM107 to generate the QLB4 strain.

Expression of Cre recombinase was induced by growing QLB4 cells at 37°C
until an OD600 of 0.6 was reached. The culture was then split into 2 parts and IPTG
was added to one part to a final concentration of 0.4 mM. As a control, the BNN132
strain [ATCC 47059; Elledge et al. (1991), supra], which contains the cre gene under
the transcriptional control of the endogenous cre promoter, was treated as described for
the QLB4 strain. Cell extracts (total protein) were prepared from all four samples
(QLB4 ± IPTG and BNN132 ± IPTG) and examined for expression of Cre
recombinase by Western blotting analysis. The Western blot was probed using a rabbit
polyclonal anti-Cre antibody (Novagen) as the primary antibody and a goat anti-rabbit
IgG horseradish peroxidase conjugate (Amersham) as the secondary antibody according
to the manufacturer's instructions. Figure 11 shows a Western blot containing extracts
prepared from (shown left to right) BNN132 cells grown in the absence of IPTG ("C")
and QLB4 cells grown in the absence ("QLB4 -") and presence of IPTG ("QLB4 +"),
respectively. The location of the Cre recombinase band is indicated by the arrowhead.
The additional bands seen on this Western blot are due to cross-reactivity of the crude
(i.e., not affinity purified) rabbit anti-Cre antibody with bacterial proteins.

Western blot analysis demonstrated that Cre protein could not be detected in
BNN132 cells grown in the presence or absence of IPTG. Cre protein was detected in
QLB4 cells grown in the presence of IPTG, but not in the absence of IPTG, by
Western blot analysis. Therefore, the expression of Cre recombinase in QLB4 cells is
greatly induced by the presence of IPTG in the growth medium. By this analysis, the
expression of Cre recombinase in QLB4 cells is dependent upon the induction of the
lac-cre gene by IPTG. However, more sensitive functional assays indicate that the Cre
protein was expressed constitutively at very low levels in both BNN132 cells and
QLB4 cells in the absence of IPTG. In these functional assays, a pUNI vector (Kn^R)
and a pHOST vector (Ap^R) were cotransformed into QLB4 cells and the transformed
cells were grown on plates containing kanamycin to select for the presence of the
pUNI-pHOST fusion plasmid. Plasmid DNA was isolated from individual
kanamycin-resistant colonies and subjected to restriction enzyme digestion to examine the
structure of the plasmid DNA. This analysis revealed that multiple isoforms of the
plasmid fusion product were present in the plasmid DNA isolated from any single
kanamycin-resistant colony. While not limiting the present invention to any particular
mechanism, it is believed that low level constitutive expression of Cre recombinase
leads to multiple fusion events between the pUNI and pHOST vectors, resulting in the
production of multimeric forms (i.e., trimer, tetramer, etc.) of the fused plasmid (the
desired fused plasmid is a dimer formed by fusion of pUNI and pHOST). The
multimeric plasmid fusion products would be expected to be unstable because
the Cre protein is constitutively expressed in QLB4 cells.

To overcome the potential problems that low level constitutive expression of
the cre gene in the host cell may cause, the expression of cre can be more tightly
controlled as described below. In addition to the approaches described below, the
pUNI and pHOST vectors can be modified as described in Example 7 and these
modified vectors can be fused using a host cell that constitutively expresses the Cre
protein.

The expression of Cre recombinase can be more tightly controlled by a variety
of means. For example, the expression of the cre gene can be made conditional when
expressing cre under the control of the lac promoter by growing the host cells in
medium containing glucose. The presence of 0.2% glucose in the growth medium
virtually shuts down transcription from the lac promoter. In addition, the lac promoter
can be modified to insert additional operator (o) sites which bind the lac repressor.
Other tightly controlled promoters are known to the art (e.g., the T7 promoter, which
requires the expression of T7 RNA polymerase; these promoters are available on the
pET vectors (Novagen)) and may be employed to control the expression of the cre
gene.

In addition to placing the cre ORF under the control of a tightly controlled
promoter, Cre expression can be tightly controlled by placing the cre gene on a
plasmid containing a temperature-sensitive (ts) replicon (e.g., rep pSC101*). When the
cre gene is carried on a ts replication plasmid, Cre will be expressed during the
transformation of the host cell (because the host cell containing the ts plasmid
containing the cre gene was maintained at the permissive temperature) but will be
absent following recombination of the pUNI and pHOST vectors when the host cell is
grown at a temperature non-permissive for replication of the ts replicon.



EXAMPLE 7 

In Vivo Recombination In Prokaryotic Hosts Using The Univector Fusion System 

As discussed above, Cre-loxP-mediated plasmid fusion can occur in vivo,
although the reverse reaction, resolution of heterodimers, might decrease its utility.
Ideally, it would be desirable to have Cre present only transiently to catalyze the initial
fusion event, then absent to allow the stable propagation of the recombinant products.
Therefore, a model was tested whereby UPS was explored in vivo in the E. coli strain
BUN13, which conditionally expresses Cre recombinase under lac control, and in a second
strain carrying cre on a plasmid, pQL269, with a Ts origin of replication derived from
pSC101. Experiments using BUN13 and co-transformation of pUNI-10 and pQL103,
an Ap^R loxP-containing plasmid, showed that the UPS reaction occurred efficiently, but
many colonies had a mixture of plasmids that required retransformation into a
non-cre-expressing strain to stabilize. However, results with the Ts plasmid were better.
Competent cells were prepared from JM107/pQL269 cells grown at 42°C for several
hours to cause loss of pQL269. Co-transformation of pUNI-10 and pQL103 into these
cells followed by selection on kanamycin plates at 42°C revealed that 25% of the
colonies contained the desired single pUNI-10-pQL103 co-integrant. These two experiments
demonstrated that UPS can be used to generate plasmid fusions in vivo and provide an
alternative to the in vitro reaction when Gst-Cre is not available.

As described in Example 6 and the experiments above, cotransformation of E.
coli cells expressing Cre protein (e.g., QLB4, BNN132) with a pUNI construct and a
pHOST construct (each construct containing a single lox site) results in the fusion of
these two constructs in vivo. If the host cell used for the recombination reaction
constitutively expresses the Cre protein, multimeric forms of the fused constructs are
generated. In addition to the methods outlined above for tightly regulating the
expression of the cre gene in the host cell, cells constitutively producing Cre protein
can be employed with modified pUNI and pHOST vectors as described in this
example. The pUNI construct is modified such that two different lox sites flank the
kanamycin resistance gene (the modified pUNI construct is termed pUNI-D). The two
lox sites differ in their spacer regions by one or two nucleotides and for the sake of
discussion the two different lox sites are referred to as "loxA" and "loxB" (e.g., loxP
and loxP511; "loxB" is used in this discussion to distinguish it from the first lox site
termed "loxA" and does not indicate the use of the loxB sequence found in the E. coli
chromosome). Cre cannot efficiently catalyze a recombination event between a loxA
site and a loxB site due to the sequence changes located in the spacer regions between the
Cre binding sites; however, Cre can efficiently catalyze the recombination between two
loxA sites or two loxB sites [Hoess et al. (1986) Nucleic Acids Res. 14:2287]. The
pHOST construct is modified such that one loxA site and one loxB site flank the
selectable marker gene (the modified pHOST construct is termed pHOST-D). In this
example, pHOST contains the sacB gene as the selectable marker (a negative
selectable marker). The presence of the sacB gene on pHOST-D provides a means of
counter-selection as cells expressing the sacB gene are killed when the cell is grown in
medium containing 5% sucrose [Gay et al. (1985) J. Bacteriol. 164:918 and (1983) J.
Bacteriol. 153:1424].

Figure 12 provides a schematic showing the strategy for in vivo recombination
in a Cre-expressing host cell (e.g., QLB4 cells) using the pUNI-D and pHOST-D
constructs. Arrows are used to indicate the direction of transcription of various genes
or gene segments in Figure 12. In Figure 12, the following abbreviations are used:
Ap^R (ampicillin resistance gene); Kn^R (kanamycin resistance gene); Ori
(non-conditional plasmid origin of replication); Ori^R (the R6Kγ conditional origin of
replication); Cre (Cre recombinase); GENEX (gene of interest). The strategy outlined
in Figure 12 is referred to as the "in vivo gene-trap." Figure 12 illustrates that the
second lox site (loxB) in pUNI-D (relative to the design of the pUNI-10 vector) is
inserted between the kanamycin resistance gene and the R6Kγ conditional origin of
replication.

To generate a pHOST-D construct, a commercially available expression vector
containing the desired promoter (and optionally enhancer) is modified as described in
Example 2 to insert the loxA site downstream of the promoter. However, it is not
necessary that a commercially available expression vector be employed as the art is
well aware of methods for the generation of expression vectors. Sequences encoding
the sacB gene [Gay et al. (1983) J. Bacteriol. 153:1424; GenBank Accession Nos.
X02730 and K01987] and the second lox site (loxB) are inserted downstream of the
first lox site (loxA).

The pUNI-D and pHOST-D constructs are cotransformed into QLB4 cells
(Example 6) and the transformed cells are plated onto LB/Ap/Kn plates containing 5%
sucrose to select for the desired recombinant. Figure 12 illustrates the recombination
events that will occur in the presence of Cre in the QLB4 cells. First, pUNI-D and
pHOST-D will fuse to form two dimers in which two possible double cross-over
events can occur. These two double cross-over events are diagrammed in Figure 12. The
double cross-over events will result in the exchange of the DNA segments that are
flanked by loxA and loxB to produce the plasmids labelled "A" and "B." All plasmids
that contain the sacB gene (the pHOST-D, the fused plasmids and plasmid B) will be
selected against by the presence of sucrose in the growth medium. The pUNI-D
construct will not be able to replicate in QLB4 cells as these cells do not express the π
protein required for replication of the R6Kγ origin. Therefore, the only construct that
will be maintained in QLB4 cells selected on LB/Kn containing sucrose is the desired
plasmid A in which the gene of interest from pUNI-D has been placed under the
transcriptional control of the promoter located on pHOST-D.
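
The selection logic described above can be sketched as a small model. The marker content assigned to each plasmid below is an assumption inferred from the description of Figure 12 (in particular, that plasmid A carries both resistance genes on the non-conditional origin and that plasmid B carries sacB on the R6Kγ origin); the class, field and function names are hypothetical, and the model treats each plasmid as if it were the only one in the cell.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plasmid:
    name: str
    ap_r: bool        # carries the ampicillin resistance gene
    kn_r: bool        # carries the kanamycin resistance gene
    sac_b: bool       # carries sacB (lethal on sucrose)
    plain_ori: bool   # carries a non-conditional origin of replication


# Marker content is an assumption inferred from the text, not read off Figure 12.
CANDIDATES = [
    Plasmid("pUNI-D", ap_r=False, kn_r=True, sac_b=False, plain_ori=False),
    Plasmid("pHOST-D", ap_r=True, kn_r=False, sac_b=True, plain_ori=True),
    Plasmid("fused dimer", ap_r=True, kn_r=True, sac_b=True, plain_ori=True),
    Plasmid("plasmid A", ap_r=True, kn_r=True, sac_b=False, plain_ori=True),
    Plasmid("plasmid B", ap_r=False, kn_r=False, sac_b=True, plain_ori=False),
]


def is_maintained(p: Plasmid, host_makes_pi: bool = False) -> bool:
    """True if a cell carrying only `p` would grow on LB/Ap/Kn + 5% sucrose."""
    replicates = p.plain_ori or host_makes_pi   # R6K-gamma ori needs the pi protein
    resists_drugs = p.ap_r and p.kn_r           # both antibiotics are in the plate
    tolerates_sucrose = not p.sac_b             # sacB expression kills on sucrose
    return replicates and resists_drugs and tolerates_sucrose


survivors = [p.name for p in CANDIDATES if is_maintained(p)]
print(survivors)
```

Under these assumptions only plasmid A passes all three filters, which is the outcome the gene-trap design relies on.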

To illustrate this method, pUNI-10 was modified to place a second lox site,
comprising the loxP511 sequence (SEQ ID NO:16), between the kanamycin resistance
gene and the R6Kγ conditional origin of replication to create pUNI-10-D. A second
lox site, comprising the loxP511 site, was inserted onto a loxP-containing expression
plasmid (i.e., a pHOST vector) to create a pHOST-D vector. One-half of one
microgram of each plasmid was cotransformed into competent QLB4 cells, aliquots
of the transformed cells were plated onto LB/Ap plates and onto LB/Ap/Kn
plates containing 5% sucrose, and the number of colonies on each type of plate was
counted. The percentage of Ap^R Kn^R colonies which grew on sucrose-containing plates
relative to the number of Ap^R colonies was 1% (1 x 10^3 / 1 x 10^5). Restriction enzyme
digestion of plasmid DNA isolated from individual Ap^R Kn^R colonies which grew on
sucrose-containing plates confirmed that the desired fusions had been generated. These
results indicate that the in vivo gene trap method can be used to recombine a gene of
interest carried on a pUNI-D vector into an expression vector using host cells that
constitutively express the Cre protein.
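
The 1% figure is simply the ratio of the two colony counts. As a minimal worked example (the function name is hypothetical and introduced here only for illustration):

```python
def percent_recombinants(selected_colonies: float, total_colonies: float) -> float:
    """Percentage of colonies that survive the double selection plus sucrose."""
    if total_colonies <= 0:
        raise ValueError("total_colonies must be positive")
    return 100.0 * selected_colonies / total_colonies


# Colony counts reported in the text: 1 x 10^3 selected out of 1 x 10^5 Ap-resistant.
pct = percent_recombinants(1e3, 1e5)
print(f"{pct:.1f}%")
```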

In addition to providing a means for recombining a gene of interest carried on
a pUNI-D vector into an expression vector using host cells that constitutively express
the Cre protein, the in vivo gene trap method provides a means to transfer a gene of
interest contained on a linear DNA molecule (e.g., a PCR product) that lacks a
selectable marker into an expression vector(s). The desired PCR product is amplified
using two primers, each of which encodes a different lox site (a "loxA" and a "loxB" site
such as a loxP and a loxP511 site). A pUNI vector is constructed that contains (5' to 3')
a loxA site, a counter-selectable marker such as the sacB gene and a loxB site (i.e., the
two different lox sites flank the counter-selectable marker). This pUNI vector also
contains a conditional origin of replication and an antibiotic resistance gene as
described above and in Example 1. The PCR product (loxA-amplified sequence-loxB)
is recombined with the modified pUNI vector (which comprises
loxA-counter-selectable marker-loxB) to create a pUNI vector containing the PCR product which
now lacks the counter-selectable marker. This recombination event is selected for by
growing the host cells in medium that kills the host if the counter-selectable gene is
expressed. The PCR product in the pUNI vector (containing 2 lox sites) can then be
placed under the control of the desired promoter element by recombining the
pUNI/PCR product construct with the appropriate pHOST-D vector.

EXAMPLE 8 

The Use Of Modified LoxP Sites To Increase Expression Of The Protein Of Interest

The pUNI and pHOST constructs employed in the Univector Plasmid Fusion
System were designed such that plasmid fusion resulted in the introduction of a lox
site between the promoter and the gene of interest. LoxP sites consist of two 13 bp
inverted repeats separated by an 8 bp spacer region [Hoess et al. (1982) Proc. Natl.
Acad. Sci. USA 79:3398 and U.S. Patent No. 4,959,317]. Transcripts of the gene of
interest produced from a pUNI-pHOST fusion construct comprising a loxP site may
have two 13 nucleotide perfect inverted repeats within the 5' untranslated region
(UTR) that have the potential to form a stem-loop structure (this will occur in those
cases where pHOST does not encode an affinity domain at the amino-terminus of the
fusion protein). It is currently believed that the ribosome scanning mechanism is the
most commonly used mechanism for initiation of translation in eukaryotes (e.g., yeast
and mammalian cells). Using this mechanism, the ribosome binds to the 5' cap
structure of the mRNA transcript and scans downstream along the 5' UTR searching
for the first ATG or translation start codon. Without limiting the present invention to
any particular mechanism, it is possible that a stem-loop structure formed by the
presence of a loxP sequence on the 5' UTR of the mRNA encoding the protein of
interest would block or reduce the efficiency of ribosome scanning and thus the
translation initiation step could be impaired. There is evidence that stem-loop
structures in the 5' UTR of particular mRNAs reduce the efficiency of translation in
eukaryotes [see, e.g., Donahue et al. (1988) Mol. Cell. Biol. 8:2964 and Yoon et al.
(1992) Genes and Dev. 6:2463]. It is noted that no evidence suggests that the
presence of a stem-loop structure in the coding region (as opposed to the 5' UTR) of a
transcript negatively affects its ability to be translated. It is likely that the energy of
protein synthesis is sufficient to overcome secondary structures present in mRNAs.
Indeed, the data presented in Example 5 show that a GST-SKP1 fusion construct
produced using the Univector Fusion System (i.e., the construct contains a loxP site
between the sequences encoding the Gst and Skp1 domains) produced the same level
of fusion protein as did a conventional construct encoding a Gst-Skp1 fusion protein
which lacks the loxP sequence. Therefore, concerns over the presence of a stem-loop
structure caused by the presence of a lox sequence in a transcript encoded by a
pUNI-pHOST fusion construct are limited to those constructs that do not generate fusion
proteins.

If low levels of expression are observed when a gene of interest is expressed
from a pUNI-pHOST fusion construct comprising lox sequences that comprise perfect
13 bp inverted repeats (e.g., loxP), pUNI and pHOST constructs containing mutated
loxP sequences are employed. The mutated loxP sequences comprise point mutations
that create mismatches between the two 13 bp inverted repeat sequences within the
loxP site that disrupt the formation of or reduce the stability of a stem-loop structure.
Specifically, two modified loxP sites were designed that have mismatches at different
positions in the inverted repeats located within a loxP site. The 13 bp inverted repeats
are binding sites for the Cre protein; thus, each loxP site has two binding sites for Cre.
For the purpose of discussion, these two binding sites are referred to as L and R (left
and right). The wild-type loxP site is designated L(0)-R(0) wherein "0" indicates the
absence of a mutation (i.e., the wild-type sequence). Two derivatives of the wild-type
loxP sequence were designed and termed loxP2 and loxP3. The sequences of loxP2
(SEQ ID NO:13) and loxP3 (SEQ ID NO:14), as well as the wild-type loxP sequence
(SEQ ID NO:12), are shown in Figure 13. LoxP2 is placed on the pUNI-10 construct
(in place of the wild-type loxP site) and loxP3 is placed on the pHOST construct.

LoxP2 has repeats designated L(3,6)-R(0), which indicates that the third and
sixth nucleotides of the left repeat are mutated; thus, a mismatch is introduced at the
third and sixth positions between the L and R repeats of the loxP2 site. LoxP3 has
repeats designated L(0)-R(9), which indicates that the ninth nucleotide on the right
repeat sequence is mutated to introduce a mismatch at the ninth position between the L
and R repeats of the loxP3 site. Fusion between the loxP2 site on the pUNI construct
and the loxP3 site on the pHOST construct will generate a hybrid loxP23 site
[L(3,6)-R(9)] located between the promoter and the gene of interest and a wild-type loxP site
[L(0)-R(0)] at the distal junction. Thus, the loxP23 site (SEQ ID NO:15) in the 5'
UTR will have three mismatches distributed at positions 3, 6 and 9 between the 13
nucleotide inverted repeats, which are expected to strongly destabilize the formation of
the stem-loop structure. Other mutated loxP sequences suitable for disruption of the
stem-loop structure will be apparent to those skilled in the art; therefore, the present
invention is not limited to the use of the loxP2 and loxP3 sequences for the purpose of
disrupting stem-loop formation on the 5' UTR of transcripts produced from
pUNI-pHOST fusion constructs. The suitability of any pair of mutated lox sites for use in
the Univector Fusion System may be tested by placing one member of the pair on a
pUNI vector and the other member on a pHOST construct. The two modified vectors
are then recombined in vitro as described in Example 4, the fusion reaction mixture
is used to transform E. coli cells, and the transformed cells are plated on selective
medium (e.g., on LB/Amp and LB/Kan plates) in order to determine the efficiency of
recombination between the two mutated lox sites (Example 4). The efficiency of
recombination between the two mutated lox sites is compared to the efficiency of
recombination between two wild-type loxP sites. Any pair of two different mutant lox
sites that recombines at a rate that is about 5% or more of that observed using two
loxP sites is a useful pair of mutated lox sites for use in avoiding the formation of a
stem-loop structure on the 5' UTR of the mRNA transcribed from the pUNI/pHOST
fusion construct.
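
The L/R bookkeeping above can be illustrated with a short sketch. It assumes, as the text states, that the promoter-proximal hybrid site takes its left repeat from the pUNI site and its right repeat from the pHOST site; it also makes the simplifying assumption that every mutated position in either repeat counts as one stem mismatch. The type and function names are hypothetical.

```python
from typing import FrozenSet, List, NamedTuple, Tuple


class LoxSite(NamedTuple):
    left: FrozenSet[int]    # mutated positions in the left 13 bp repeat
    right: FrozenSet[int]   # mutated positions in the right 13 bp repeat


def fuse(uni: LoxSite, host: LoxSite) -> Tuple[LoxSite, LoxSite]:
    """Return (site between promoter and gene, site at the distal junction)."""
    proximal = LoxSite(uni.left, host.right)
    distal = LoxSite(host.left, uni.right)
    return proximal, distal


def stem_mismatches(site: LoxSite) -> List[int]:
    """Positions at which the two repeats can no longer pair (simplified)."""
    return sorted(site.left | site.right)


LOXP2 = LoxSite(frozenset({3, 6}), frozenset())   # L(3,6)-R(0), on pUNI
LOXP3 = LoxSite(frozenset(), frozenset({9}))      # L(0)-R(9), on pHOST

proximal, distal = fuse(LOXP2, LOXP3)
print(stem_mismatches(proximal), stem_mismatches(distal))
```

In this model the proximal site carries mismatches at positions 3, 6 and 9 while the distal site is wild-type, matching the L(3,6)-R(9) and L(0)-R(0) designations in the text.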

A strategy as described above was employed to determine if the reduced
expression observed with the SKP1 ORF under control of the GAL1 promoter as
described in Example 5 could be improved with mutated lox sites. A series of lox
sites designed to reduce the stability of the stem-loop were employed. These, together
with a control scrambled site, loxS, were placed between the GAL1 promoter and the
lacZ reporter gene and β-galactosidase expression was measured. Mutations that
decreased stem-loop stability tended to express better and one mutant, loxP369, did not
display any inhibitory effects. This mutant also retained 25% of the wild-type
recombination efficiency and has been designated loxH (i.e., for host). The
oligonucleotides used to generate the loxH site are based on the loxH sequence
5'-ATTACCTCATATAGCATACATTATACGAAGTTAT-3' (SEQ ID NO:32). LoxH
was further tested by using it to place MYC-RNR4 under GAL1 control and showed no
translational interference, as shown in Figure 17 (compare lanes 2, 3, and 4). LoxH's
25% recombinational efficiency is well within the range useful for UPS-mediated
plasmid constructions. Thus, it is recommended that loxH be used in pHOST recipient
vectors intended for transcriptional fusions to maximize expression, while loxP should
be used for all other applications because of its higher recombination efficiency.
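
As an illustrative check, the loxH sequence given above can be compared with the 34 bp wild-type loxP site (13 bp repeat, 8 bp spacer, 13 bp repeat) to see where the two repeats stop being perfect inverted copies of each other. This is a sketch only: the helper names are hypothetical, and positions are counted from the outer end of the left repeat, which is an assumed numbering convention.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"   # wild-type loxP
LOXH = "ATTACCTCATATAGCATACATTATACGAAGTTAT"   # loxH sequence quoted in the text


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def split_site(site: str):
    """Split a 34 bp lox site into (left repeat, spacer, right repeat)."""
    if len(site) != 34:
        raise ValueError("a lox site is expected to be 34 bp long")
    return site[:13], site[13:21], site[21:]


def repeat_mismatches(site: str):
    """1-based positions where the left repeat cannot pair with the right one."""
    left, _spacer, right = split_site(site)
    partner = reverse_complement(right)   # what a perfect left repeat would be
    return [i + 1 for i, (a, b) in enumerate(zip(left, partner)) if a != b]


print(repeat_mismatches(LOXP))
print(repeat_mismatches(LOXH))
```

With this numbering the wild-type site shows no mismatches and loxH shows mismatches at positions 3, 6 and 9 of the left repeat, while the spacer and right repeat are unchanged.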

It will be apparent to those skilled in the art that a similar strategy can be
employed for the modification of frt sites when the FLP recombinase is employed for
the recombination event. The frt site, like lox sites, contains two 13 bp inverted
repeats separated by an 8 bp spacer region.

EXAMPLE 9 

Precise ORF Transfer (POT) 

In order to transfer only the gene of interest from the Univector to the Host
vector, the present invention provides a second recombination event that allows a
resolution of the UPS-generated heterodimer. A schematic representation of the POT
reaction is shown in Figure 20. In one embodiment of the present invention, an
R-recombination site, RS, was placed after the cloning site in pUNI (i.e., pUNI-20) such
that any gene inserted into pUNI-20 would be flanked on the 5' side by loxP and on
the 3' side by RS, although the present invention contemplates the use of any other
second recombination system (e.g., the Res system). Host recipient vectors must also
contain lox and RS elements in the correct order. The initial fusion event is catalyzed
by Cre by UPS. The second reaction can be catalyzed in vitro by incubation with
purified R-recombinase (Araki et al., supra) or in vivo by transformation into a strain
(e.g., BUN15) expressing the R-recombinase under tac control on a Ts replication
plasmid (e.g., pML66) that is lost when cells are plated at 42°C. POT works
efficiently as a two step reaction in vivo or in vitro. Efficient resolution in vivo
without a selection for the second recombination event requires incubation in LB plus
IPTG after transformation prior to plating on selective media. Incubations of 1 h
and 4 h gave 3% and 15% recombinants, respectively, which showed complete loss of
the pUNI backbone through recombination between RS sequences. In vitro
recombination catalyzed by the R recombinase achieved 30% recombinants.

The efficiency of recovering plasmids that have undergone POT can be greatly
enhanced through the use of a recipient vector in which a counter-selectable marker is
placed between the loxP and RS sites. For this purpose, the present invention utilized
the ΦX174 E gene, which is toxic when expressed in E. coli unless the host cell lacks
the slyD gene [Maratea et al. (1985) Gene 40:39]. pAS2-E, a two hybrid bait vector
derived from pAS2 [Durfee et al. (1994) Genes & Dev. 7:555] which contains in a 5'
to 3' order loxP, E under control of the tac promoter, and an RS site, was fused with
pUNI-20 containing the SKP1 gene, and the co-integrant was selected by
transformation into CXI (slyD^-). This co-integrant was then transformed into BUN15
cells expressing the R recombinase and resolution events were isolated by selecting for
Ap^r in the presence of IPTG to induce the E protein. Since BUN15 is slyD^+, pAS2-E
alone cannot survive in it because of toxicity due to E expression. However, when
pAS2-E is fused to pUNI-20 derivatives, it can transform that strain because
subsequent R-dependent site-specific recombination between RS sites will eliminate
both the pUNI backbone and E. This results in the replacement of E with the
corresponding region from pUNI. One hundred percent (24 of 24) of Ap^r transformants
resulting from the transformation of the pAS2-E-pUNI-20-SKP1 fusion plasmid
showed precise transfer of the SKP1 gene from pUNI-20 into pAS2-E with only 1 hr
incubation prior to plating on selective media.
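
The two-step logic of POT (Cre-mediated fusion at loxP followed by R-mediated resolution between the RS sites) can be sketched by treating each plasmid as a circular list of elements. This is a simplified model under stated assumptions: sites are treated as points, the two RS sites are assumed to be in direct orientation, and the element labels and function names are hypothetical placeholders rather than maps of the actual plasmids.

```python
from typing import List, Tuple


def cointegrate(host: List[str], uni: List[str], site: str = "loxP") -> List[str]:
    """Fuse two circles, each carrying one copy of `site`, into one circle."""
    def rotate(circle: List[str]) -> List[str]:
        i = circle.index(site)
        return circle[i:] + circle[:i]
    return rotate(host) + rotate(uni)


def resolve(circle: List[str], site: str = "RS") -> Tuple[List[str], List[str]]:
    """Split a circle carrying two directly repeated `site` copies into two circles."""
    hits = [i for i, element in enumerate(circle) if element == site]
    if len(hits) != 2:
        raise ValueError("resolution needs exactly two copies of the site")
    first, second = hits
    return circle[first:second], circle[second:] + circle[:first]


# Element order follows the text: loxP, then E (or the gene), then RS.
PAS2_E = ["pAS2 backbone", "loxP", "E", "RS"]
PUNI_SKP1 = ["loxP", "SKP1", "RS", "pUNI-20 backbone"]

intermediate = cointegrate(PAS2_E, PUNI_SKP1)
product, byproduct = resolve(intermediate)
print(sorted(product))
print(sorted(byproduct))
```

In this sketch one resolution product keeps the pAS2 backbone together with loxP, SKP1 and RS, while the other carries the pUNI-20 backbone and the E gene, consistent with the replacement of E by the corresponding region from pUNI described above.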

Transformation of a heterodimeric plasmid with E flanked by RS sites into
BUN15 gave a transformation efficiency several orders of magnitude greater than
transformation of the pAS2-E plasmid itself. This demonstrated that POT can be achieved in a single
step by direct transformation of a UPS reaction into BUN15 (i.e., rather than a
two-step process). pUNI-20-SKP1 and pAS2-E were incubated with Gst-Cre in a standard
UPS reaction and the reaction mixture was transformed directly into BUN15 and Ap^r
transformants were selected at 42°C after an hour incubation. One hundred percent (20
of 20) of Ap^r transformants were found to have undergone POT with SKP1 replacing
the E gene in pAS2-E as determined by restriction digestion with PvuII, as shown in
Figure 21. The sample shown in Figure 21 was generated from plasmid DNA isolated
from 10 different Ap^r transformants, digested as described above along with two
parental plasmids, P1 (pUNI-20-SKP1) and P2 (pAS2-E), and I (the UPS-generated
pUNI-20-SKP1-pAS2-E recombination intermediate). Precise ORF transfer resulted in
the generation of a novel 800 bp PvuII fragment indicated by the arrowhead.

For POT assays, BUN15 cells were grown overnight in LB containing
spectinomycin (50 µg/ml) at 30°C. BUN15 cells were diluted 1 to 100 in fresh
LB/Spec media containing 0.3 mM IPTG and grown to an OD of 0.5. Electrocompetent
cells were prepared as recommended (Bio-Rad). Forty µl of competent cells were used
in each transformation. After the electrotransformation, cells were incubated in LB
plus IPTG for 1-8 hr for recovery before being plated on LB/Amp/IPTG (1 mM) and
incubated at 42°C.

EXAMPLE 10 

Library Transfer Using UPS 

The ability to use the methods and compositions of the present invention for
generating and subcloning entire nucleic acid libraries is demonstrated in this Example.
A random shear S. cerevisiae genomic library was made in pUNI-10 using the
XhoI-adaptor strategy [Elledge et al. (1991) Proc. Natl. Acad. Sci. USA 88:1731]. This library
had 5 x 10^5 recombinants with 80% inserts ranging from 3 kb to 8 kb. This library was
fused to pRS425-lox, a URA3 2µ plasmid, using UPS and 1.6 x 10^6 recombinant fusion
plasmids were recovered. This library was used to transform an S. cerevisiae cdc4-1
mutant strain Y543 and Ura^+ transformants were selected at 34°C, the non-permissive
temperature of cdc4-1. Of 31 plasmids capable of conferring growth at 34°C, three
classes were recovered. One class was CDC4 as expected, the second was SKP1, and
the third was CLB3. SKP1 and CLB4, a cyclin closely related to CLB3, had been
previously shown to suppress cdc4-1 mutants when overexpressed from the GAL
promoter [Bai et al. (1994) EMBO J. 3:6087; and Bai et al., supra]. These
experiments demonstrate the feasibility of library transfer using UPS. In cases where a
cDNA expression library is created, such as for the two hybrid system, once clones
have been isolated, they can be rapidly converted back into simple Univector clones by
Cre recombination in vivo. Using UPS, these plasmids can now be rapidly fused with
any of a series of pHOST expression vectors for future analytical needs.
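
The library numbers reported above imply that the fused library oversamples the original pUNI-10 library. A rough worked example follows; the Poisson sampling model is an assumption added here for illustration (it treats every original clone as equally likely to be fused and recovered), and the function names are hypothetical.

```python
import math


def fold_coverage(recovered: float, library_size: float) -> float:
    """How many fusion plasmids were recovered per original recombinant."""
    return recovered / library_size


def fraction_represented(coverage: float) -> float:
    """Expected fraction of original clones seen at least once (Poisson model)."""
    return 1.0 - math.exp(-coverage)


coverage = fold_coverage(1.6e6, 5e5)        # values reported in the text
represented = fraction_represented(coverage)
print(f"{coverage:.1f}-fold coverage, about {represented:.0%} of clones represented")
```

Under this assumed model, the 3.2-fold oversampling would be expected to carry most, though not all, of the original clones into the fused library.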

EXAMPLE 11

General Material and Methods 

This Example provides general materials and methods used throughout the 
experiments discussed above and below. 

I. Media, Enzymes, and Chemicals

For drug selections, LB plates or liquid media were supplemented with either
kanamycin (40 µg/ml) or ampicillin (100 µg/ml). When necessary, isopropyl
β-D-thiogalactoside (IPTG) was added to a final concentration of 0.3 mM and X-Gal
(Sigma) was used at 80 µg/ml. Yeast growth media and plates were made according
to Rose et al. [Rose et al. (1990) Laboratory course manual for methods in yeast
genetics, Cold Spring Harbor, New York, Cold Spring Harbor Laboratory Press].
Restriction endonucleases, large (Klenow) fragment of E. coli DNA polymerase I, T4
polynucleotide kinase, T4 DNA polymerase, and T4 DNA ligase were purchased from New
England Biolabs. Drugs were purchased from Sigma if not otherwise specified.

II. Bacterial and Yeast Strains 

E. coli BW23474 [Δlac-169, robA1, creC510, hsdR514, uidA(ΔMluI)::pir-116,
endA, recA1] and BW23473 [Δlac-169, robA1, creC510, hsdR514, uidA(ΔMluI)::pir+,
endA, recA1] (Metcalf et al., supra) were a gift of B. Wanner and were used as hosts for
propagation of all Univector based plasmids. BUN10 [hisG4 thr-1 leuB6 lacY1
kdgK51 Δ(gpt-proA)62 rpsL31 tsx33 supE44 recB21 recC22 sbcA23
hsdR::cat-pir-116 (Cm^R)] was used for homologous recombination experiments. BUN13, which has
cre under the control of the lac promoter, is JM107 lysogenized with λLC (aadA
lac-cre). BUN15 is XL1 blue containing pML66 (tac-R, Sp^r) and was used for the in vivo
RS recombination assays. E. coli JM107 or DH5α [Sambrook et al. (1989) Molecular
Cloning: A Laboratory Manual, Cold Spring Harbor Lab., Cold Spring Harbor, NY,
2nd Ed.] were the transformation recipients for all other plasmid construction,
including those made by UPS. E. coli BL21 was used as the host for bacterial
expression studies. CXI (ara leu purE gal trp his argG rpsL thi-1 supE lacIq slyD1)
was used for propagation of E expression clones. S. cerevisiae Y80 [Zhou and Elledge
(1992) Genetics 131:851] was used for yeast expression studies and Y543 (as Y80 but
cdc4-1) was used for cdc4 suppression (Bai et al., 1994, supra).

III. Plasmid Construction 

The construction of several of the plasmids used in the examples of the present
invention is provided below. These examples are provided to illustrate strategies and
general methods used in making plasmids for use in the UPS. However, these specific
plasmids and methods of construction are not required to practice the present
invention.

For the Gst-Cre expression construct, pQL123, the cre ORF was amplified by
PCR and an NcoI site placed at the first ATG using primers
5'-CCATGGCCAATTTACTGACCGTACAC-3' (SEQ ID NO:21) and
5'-CCCGGGCTAATCGCCATCTTCCAGC-3' (SEQ ID NO:20). The PCR product
was cloned into pCR™II (Invitrogen) and subcloned as an NcoI-EcoRI fragment into
NcoI-EcoRI digested pGEX-2TKcs to create pQL123.
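
As an illustration only (it is not part of the original methods), the design of the forward primer can be checked computationally. The minimal Python sketch below assumes the standard genetic code; the helper name `translate` and the variable names are hypothetical. It locates the NcoI site (CCATGG) in SEQ ID NO:21 and translates the primer from the ATG embedded in that site.

```python
# Illustrative sketch; helper and variable names are hypothetical.
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(dna):
    """Translate complete codons into one-letter amino acids ('*' = stop)."""
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

forward_primer = "CCATGGCCAATTTACTGACCGTACAC"  # SEQ ID NO:21
NCOI_SITE = "CCATGG"

site_start = forward_primer.find(NCOI_SITE)  # index of the NcoI site
atg_start = site_start + 2                   # the ATG sits inside CCATGG
peptide = translate(forward_primer[atg_start:])

print(site_start, peptide)
```

Under these assumptions the NcoI site lies at the 5' end of the primer and the reading frame opens with Met-Ala-Asn-Leu-Leu-Thr-Val-His, which agrees with the Cre portion of the fusion shown in SEQ ID NO:10.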

The pHOST plasmid pQL103 was made by deleting one loxP site from
pSE1086, which contains a XhoI-loxP-NotI-loxP-SalI cassette, by digestion with NotI
and SalI, filling in the ends with Klenow and religation. The 590 bp NcoI-BamHI
fragment containing the S. cerevisiae SKP1 ORF was subcloned from pCB149 into
NcoI-BamHI-cut pUNI-10 to create pQL130 (pUNI-SKP1).

A second subclone of SKP1 is pML73, which contains the same 5' end of SKP1
but an additional 800 bp of genomic DNA to the next BamHI site at the 3' end cloned
into pUNI-20. pML73 was used for the POT experiments. An oligo linker containing
loxP and flanked by NcoI and BamHI overhangs was made by annealing two oligos,
5'-CATGGCTATAACTTCGTATAGCATACATTATACGAAGTTATG-3' (SEQ ID NO:22) and
5'-GATCCATAACTTCGTATAATGTATGCTATACGAAGTTAT-3' (SEQ ID NO:23),
and then ligating into NcoI and BamHI digested pGEX-2TKcs to
create pHB2-GST. The MYC3-RNR4 gene was subcloned from pMH176 [Huang and
Elledge (1997) Mol. Cell. Biol. 17:6105] as a XhoI-SacI fragment into XhoI-SacI-cleaved
pUNI-10 to create pQL248, or into SalI-SacI digested pBAD104, a GAL1
expression vector, to create the control lacking loxP. Two pBAD104-derived recipient
vectors, pQL138 and pQL193, were constructed by insertion of either a wild-type loxP
or loxP 369 sequence into the polylinker using primer pairs:
5'-TCGAGACGTCATAACTTCGTATAGCATACATTATACGAAGTTATGC-3' (SEQ ID NO:24) and
5'-GCCGCATAACTTCGTATAATGTATGCTATACGATGTTATGACGTC-3' (SEQ ID NO:25) (pQL138), or
5'-CATGGCTATAACTTCGTATAGCATACATTATACGAAGTTATG-3' (SEQ ID NO:26) and
5'-GATCCATAACTTCGTATAATGTATGCTATACGAAGTTATAGC-3' (SEQ ID NO:27) (pQL193).
Two GAL1:MYC3-RNR4 constructs were made by UPS between
pQL248 and pQL138 or pQL193.
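
As an illustration only, the annealing of a linker oligo pair can be modeled in a few lines of Python. The sketch below takes the pQL193 primer pair (SEQ ID NO:26 and SEQ ID NO:27) and reports the duplex region and the two 5' overhangs. The function names are hypothetical, and the model assumes perfect Watson-Crick pairing with a 5' overhang on each strand; it is not a description of how the oligos were designed.

```python
# Illustrative sketch; function names are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA string written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def anneal(top, bottom):
    """Anneal two oligos (both written 5'->3').

    Assumes each strand carries a 5' overhang, so the 3' portion of `top`
    pairs with the 3' portion of `bottom`.
    Returns (top 5' overhang, duplex as read on top strand, bottom 5' overhang).
    """
    rc_bottom = reverse_complement(bottom)
    for i in range(len(top)):
        paired = top[i:]
        if rc_bottom.startswith(paired):
            unpaired_bottom = len(bottom) - len(paired)
            return top[:i], paired, bottom[:unpaired_bottom]
    raise ValueError("oligos do not anneal under the assumed geometry")

top = "CATGGCTATAACTTCGTATAGCATACATTATACGAAGTTATG"     # SEQ ID NO:26
bottom = "GATCCATAACTTCGTATAATGTATGCTATACGAAGTTATAGC"  # SEQ ID NO:27
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"            # 34 bp wild-type loxP

top_overhang, duplex, bottom_overhang = anneal(top, bottom)
print(top_overhang, len(duplex), bottom_overhang, LOXP in duplex)
```

In this model the annealed pair carries a CATG overhang (compatible with NcoI-cut DNA) on one end and a GATC overhang (compatible with BamHI-cut DNA) on the other, with an intact loxP site in the 38 bp duplex.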

For the construction of pQL269 (lac-cre aadA on a Ts pSC101 ori), the EcoRI-PvuII
fragment from pQL114 containing aadA and the lac-cre gene fusion was ligated
to a BglI (made blunt by T4 polymerase)-EcoRI fragment from pINT-ts [Hasan et al.
(1994) Gene 150:51] containing the Ts replication origin and transformants were
screened for SpR and Ts growth at 42°C. A plasmid with those properties was
designated pQL269.

pML66 was constructed by ligating the EcoRI-SalI (blunt) fragment containing
the tac promoter driving the R recombinase from pNN115 (Araki et al., supra) into
EcoRI-PstI (blunt) cleaved pQL269. This spectinomycin-resistant plasmid expresses R
protein in the presence of IPTG and is lost from cells grown at 42°C because of a
temperature-sensitive replication mutation.

pUNI-Amp was made by placing the bla gene from pUC19 in place of the neo
gene on pUNI-20 by generating a PCR product of bla and ligating that into Mhd-NheI
(blunt) cleaved pUNI-20. The triple MYC tag was subcloned into pUNI-Amp by
PCR amplification of the 3xMYC tag present on pJBN48 with the
primers MZL154, 5'-AAATTTCTCGAGGCTCTGAGCAAAAGCTCAT-3' (SEQ ID NO:28)
and MZL155, 5'-TATATATAGCGGCCGCTTAATTAAGATCCTCCTCGGATA-3' (SEQ ID
NO:29), followed by cleavage of the PCR product with XhoI and NotI and ligation
into XhoI-NotI cleaved pUNI-Amp to generate pML74. The PCR primers
used to amplify the 3xMYC tag from pML74 for tagging the C-terminus of SKP1 by
homologous recombination were primer A (MZL160),
5'-CCAGAGGAGGAGGCTGCCATTAGGCGTGAAAATGAATGGGCTGAAGACCGTCTGAGCAAAAGCTCATTTC-3'
(SEQ ID NO:30) and primer B (MZL161),
5'-GGATATAGTTCCTCCTTTCAGC-3' (SEQ ID NO:31).
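
As an illustration only, the restriction sites carried by the two 3xMYC amplification primers can be located with a simple scan. The Python sketch below is a minimal example; the function name `find_sites` and the small enzyme table are hypothetical choices, and only well-known recognition sequences are listed. Because each listed site is palindromic, scanning one strand is sufficient.

```python
# Illustrative sketch; the function name and enzyme table are hypothetical.
SITES = {
    "XhoI": "CTCGAG",
    "NotI": "GCGGCCGC",
    "NcoI": "CCATGG",
    "BamHI": "GGATCC",
}

def find_sites(seq, sites=SITES):
    """Return {enzyme: [0-based start positions]} for sites present in seq."""
    hits = {}
    for name, motif in sites.items():
        positions = [
            i for i in range(len(seq) - len(motif) + 1)
            if seq[i:i + len(motif)] == motif
        ]
        if positions:
            hits[name] = positions
    return hits

mzl154 = "AAATTTCTCGAGGCTCTGAGCAAAAGCTCAT"          # SEQ ID NO:28
mzl155 = "TATATATAGCGGCCGCTTAATTAAGATCCTCCTCGGATA"  # SEQ ID NO:29

print(find_sites(mzl154))
print(find_sites(mzl155))
```

Under these assumptions MZL154 carries a single XhoI site and MZL155 a single NotI site, each preceded by a few extra 5' bases, consistent with cleavage of the PCR product with XhoI and NotI before ligation.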

pAS2-E was constructed by first placing a synthetic loxP site between the NcoI-SalI
sites of pAS2 to make pAS2-lox, then generating an E-containing fragment
with the following features: a 5' XhoI site, the tac promoter driving E, and a 3' SpeI
site. The XhoI-SpeI fragment was ligated together with a SpeI-PstI synthetic RS
fragment into XhoI-PstI cleaved pAS2-lox to make pAS2-E (pML71).

IV. β-Galactosidase Assays

Yeast cells expressing the GAL1:lacZ reporter constructs containing different
loxP sequences were grown at 30°C to mid-log phase (OD600 = 0.5-0.6) in SC-Ura
media containing 2% raffinose. Galactose was then added to a final concentration of
2%, and the cells were incubated at 30°C for two hours. β-Galactosidase activities
were measured as described by Zhou and Elledge (supra).

EXAMPLE 12 

Construction of BUN 13 

This Example describes the construction of BUN 13, a lambda lysogen with cre
under lac control. pSE356 contains a cassette consisting of the Tn5 neo gene, the lac
promoter, and a polylinker sequence surrounded by stretches of λ DNA sequence.
pQL114, the plasmid used to recombine the cre gene into λ, was constructed in two
steps. First, the BamHI-HindIII (made blunt by T4 DNA polymerase) fragment
containing the spectinomycin resistance gene aadA from pDPT270 [Taylor and Cohen
(1979) J. Bacteriol. 137:92] was subcloned into BamHI-SphI (made blunt by T4 DNA
polymerase) digested pSE356 to create pQL102, replacing neo with aadA. Second, a
NotI site was engineered at the 5' end of the ribosomal binding site of the cre gene by
PCR using primers 5'-GCGGCCGCTGAGTGTTAAATGTCCAATT-3' (SEQ ID
NO:19) and 5'-CCCGGGCTAATCGCCATCTTCCAGC-3' (SEQ ID NO:20). The
PCR product was cloned into pCR™II and subcloned as a NotI-EcoRI fragment into
NotI-EcoRI digested pQL102 to create pQL114, placing cre under lac control adjacent
to aadA and flanked by λ DNA sequence. λKC (Elledge et al., supra) was amplified
on JM107 containing pQL114 and the resulting phage lysate containing the desired
recombinant λLC phage was used to infect JM107. SpR KnS lysogens were selected and
tested for Cre expression and the ability to perform UPS. One strain with those
properties was designated BUN 13.

It is clear from the above that the present invention provides methods for the
subcloning of nucleic acid molecules that permit the rapid transfer of a target nucleic
acid sequence (e.g., a gene of interest) from one nucleic acid molecule to another in vitro
or in vivo without the need to rely upon restriction enzyme digestions.

All publications and patents mentioned in the above specification are herein
incorporated by reference. Various modifications and variations of the described
method and system of the invention will be apparent to those skilled in the art without
departing from the scope and spirit of the invention. Although the invention has been
described in connection with specific preferred embodiments, it should be understood
that the invention as claimed should not be unduly limited to such specific
embodiments. Indeed, various modifications of the described modes for carrying out
the invention which are obvious to those skilled in molecular biology or related fields
are intended to be within the scope of the following claims.

CLAIMS

We claim:

1. A method for the recombination of nucleic acid constructs, comprising:

a) providing: 

i) a first nucleic acid construct comprising, in operable 
order, an origin of replication, a first sequence-specific recombinase 
target site, and a nucleic acid of interest; 

ii) a second nucleic acid construct comprising, in operable 
order, an origin of replication, a regulatory element and a second 
sequence-specific recombinase target site adjacent to and downstream
from said regulatory element; and 

iii) a site-specific recombinase; 

b) contacting said first and said second nucleic acid constructs with 
said site-specific recombinase under conditions such that said first and second 
nucleic acid constructs are recombined to form a third nucleic acid construct, 
wherein said nucleic acid of interest is operably linked to said regulatory 
element. 

2. The method of Claim 1, wherein said regulatory element comprises a 
promoter element. 

3. The method of Claim 1, wherein said regulatory element comprises a 
fusion peptide. 

4. The method of Claim 3, wherein said fusion peptide comprises an
affinity domain.

5. The method of Claim 3, wherein said fusion peptide comprises an
epitope tag.

6. The method of Claim 1, wherein said nucleic acid of interest comprises
a gene.

7. The method of Claim 1, wherein said first nucleic acid construct further
comprises a selectable marker.

8. The method of Claim 1, wherein said second nucleic acid construct
further comprises a selectable marker.

9. The method of Claim 1, wherein said first nucleic acid construct further
comprises a prokaryotic termination sequence.

10. The method of Claim 1, wherein said first nucleic acid construct further
comprises a eukaryotic polyadenylation sequence.

11. The method of Claim 1, wherein said first nucleic acid construct further
comprises a conditional origin of replication.

12. The method of Claim 1, wherein said first sequence-specific
recombinase target site is selected from the group consisting of loxP, loxP2, loxP3,
loxP23, loxP511, loxB, loxC2, loxL, loxR, loxΔ86, loxΔ117, frt, dif, loxH and att.

13. The method of Claim 1, wherein said second sequence-specific
recombinase target site is selected from the group consisting of loxP, loxP2, loxP3,
loxP23, loxP511, loxB, loxC2, loxL, loxR, loxΔ86, loxΔ117, frt, dif, loxH and att.

14. The method of Claim 1, wherein said first nucleic acid construct further
comprises a polylinker.

15. The method of Claim 1, wherein said contacting said first and said 
second nucleic acid constructs with said site-specific recombinase comprises 
introducing said first and said second nucleic acid constructs into a host cell under 
conditions such that said third nucleic acid construct is capable of replicating in said 
host cell. 

16. The method of Claim 15, wherein said site-specific recombinase is 
encoded by said host cell. 

17. The method of Claim 1, wherein said first nucleic acid construct further
comprises a third sequence-specific recombinase target site and said second nucleic
acid construct further comprises a fourth sequence-specific recombinase target site.

18. The method of Claim 17, wherein said first sequence-specific 
recombinase target site and said third sequence-specific recombinase target site in said 
first nucleic acid construct are located on opposite sides of said nucleic acid of interest. 

19. The method of Claim 17, wherein said third and fourth sequence-specific
recombinase target sites are selected from the group consisting of RS sites and
Res sites.

20. The method of Claim 1, wherein said first nucleic acid construct further
comprises a third sequence-specific recombinase target site and said second nucleic
acid construct further comprises a fourth sequence-specific recombinase target site,
wherein the method further comprises providing a second site-specific recombinase and
step c) contacting said third nucleic acid construct with said second site-specific
recombinase under conditions such that said third nucleic acid construct is recombined
to form a fourth and a fifth nucleic acid construct.

21. A recombined nucleic acid construct prepared according to the method
of Claim 1.

22. A method for the recombination of nucleic acid constructs, comprising:

a) providing:

i) a vector;

ii) a linear nucleic acid molecule comprising a sequence
complementary to at least a portion of said vector; and

iii) an E. coli host cell, wherein said host cell comprises an
endogenous recombination system, a loss of function rec mutation, a
suppressor, and a loss of function endogenous restriction modification
system mutation; and

b) introducing said vector and said linear nucleic acid molecule into
said host cell under conditions such that said linear nucleic acid molecule and
said vector are recombined to form a recombinant nucleic acid construct.

23. The method of Claim 22, wherein said loss of function rec mutation is
selected from the group consisting of recBC and recD.

24. The method of Claim 22, wherein said suppressor comprises sbc.

25. The method of Claim 22, wherein said loss of function endogenous
restriction modification system mutation comprises hsdR.

26. A method for the cloning of nucleic acid libraries, comprising:

a) providing:

i) a plurality of first nucleic acid constructs comprising, in
operable order, an origin of replication, a first sequence-specific 
recombinase target site, and a nucleic acid member from a nucleic acid 
library; 

ii) a plurality of second nucleic acid constructs comprising, 
in operable order, an origin of replication, a regulatory element and a 
second sequence-specific recombinase target site adjacent to and 
downstream from said regulatory element; and 

iii) a site-specific recombinase; 

b) contacting said plurality of first and second nucleic acid 
constructs with said site-specific recombinase under conditions such that said 
plurality of first and second nucleic acid constructs are recombined to form a 
plurality of third nucleic acid constructs, wherein said nucleic acid members 
from said nucleic acid library are operably linked to said regulatory elements. 

27. A nucleic acid library prepared according to the method of Claim 26. 

28. A method for the directional cloning of a nucleic acid molecule, 
comprising: 

a) providing: 

i) first and second portions of a regulatory element; 

ii) a first nucleic acid molecule comprising said first portion 
of said regulatory element; and 

iii) a second nucleic acid molecule comprising said second 
portion of said regulatory element; and 

b) combining said first and said second nucleic acid molecules to
produce a third nucleic acid molecule under conditions whereby an intact
regulatory element is produced from the combination of said first and said
second portions of said regulatory element, wherein the presence of said intact
regulatory element in said third nucleic acid molecule indicates a direction of
cloning of said first nucleic acid molecule with respect to said second nucleic
acid molecule.

29. The method of Claim 28, wherein said regulatory element comprises a 
lacO site. 

30. A method for regulated recombination in host cells that constitutively
express a recombinase, comprising:

a) providing:

i) a host cell expressing a recombinase;

ii) a first nucleic acid construct comprising an origin of
replication, a first site-specific recombinase site, a second site-specific
recombinase site that differs in sequence from said first site-specific
recombinase site such that said recombinase will not initiate
recombination between said first and second site-specific recombinase
sites, and a selectable marker gene between said first and second
site-specific recombinase sites; and

iii) a second nucleic acid construct comprising an origin of
replication, a third site-specific recombinase target site, and a fourth
site-specific recombinase target site that differs in sequence from said
third site-specific recombinase site such that said recombinase will not
initiate recombination between said third and fourth site-specific
recombinase sites; and

b) introducing said first and second nucleic acid constructs into said 
host cell under conditions such that said first and second nucleic acid constructs 
are recombined. 

31. The method of Claim 30, further comprising the step of selecting for a
desired recombinant nucleic acid molecule using said selectable marker.

32. The method of Claim 30, wherein said first nucleic acid construct is a
Univector.

33. The method of Claim 30, wherein said second nucleic acid construct is a 
Univector. 

34. A host cell expressing a recombinant nucleic acid construct prepared 
according to the method of Claim 30, wherein said host cell constitutively expresses a 
recombinase. 

35. A method for the recombination of nucleic acid constructs, comprising: 

a) providing: 

i) a first nucleic acid construct comprising a loxH site;

ii) a second nucleic acid construct comprising a loxR site; 

and 

iii) a site-specific recombinase; and 

b) contacting said first and said second nucleic acid constructs with 
said site-specific recombinase under conditions such that said first and second 
nucleic acid constructs are recombined. 

36. A recombined nucleic acid construct prepared according to the method
of Claim 35.

ABSTRACT

The present invention provides compositions, including vectors, and methods
for the rapid subcloning of nucleic acid sequences in vivo and in vitro. In particular,
the invention provides vectors used to contain a gene of interest that comprise a
sequence-specific recombinase target site. These vectors are used to rapidly transfer
the gene or genes of interest into any vector that contains a sequence-specific
recombinase target site located downstream of a regulatory element so that the gene of
interest may be regulated.

SEQUENCE DESCRIPTION: SEQ ID NO:1:

AATTCTGTCA GCCGTTAAGT GTTCCTGTGT CACTGAAAAT TGCTTTGAGA GGCTCTAAGG 
60 

GCTTCTCAGT GCGTTACATC CCTGGCTTGT TGTCCACAAC CGTTAAACCT TAAAAGCTTT 
120 

AAAAGCCTTA TATATTCTTT TTTTTCTTAT AAAACTTAAA ACCTTAGAGG CTATTTAAGT 
180 

TGCTGATTTA TATTAATTTT ATTGTTCAAA CATGAGAGCT TAGTACGTGA AACATGAGAG 
240 

CTTAGTACGT TAGCCATGAG AGCTTAGTAC GTTAGCCATG AGGGTTTAGT TCGTTAAACA
300 

TGAGAGCTTA GTACGTTAAA CATGAGAGCT TAGTACGTGA AACATGAGAG CTTAGTACGT 
360 

ACTATCAACA GGTTGAACTG CTGATCAACA GATCCTCTAC GCGGCCGCGG TACCATAACT 
420 

TCGTATAGCA TACATTATAC GAAGTTATCT GGAATTCCCC GGGCTCGAGA ACATATGGCC
480 

ATGGGGATCC GCGGCCGCAA TTGTTAACAG ATCCGTCGAC GAGCTCGCTA TCAGCCTCGA 
540 

CTGTGCCTTC TAGTTGCCAG CCATCTGTTG TTTGCCCCTC CCCCGTGCCT TCCTTGACCC 
600 

TGGAAGGTGC CACTCCCACT GTCCTTTCCT AATAAAATGA GGAAATTGCA TCGCATTGTC 
660 

TGAGTAGGTG TCATTCTATT CTGGGGGGTG GGGTGGGGCA GGACAGCAAG GGGGAGGATT 
720 

GGGAAGACAA TAGCAGGCAT GCTGGGGATT CTAGAAGATC CGGCTGCTAA CAAAGCCCGA 
780 

AAGGAAGCTG AGTTGGCTGC TGCCACCGCT GAGCAATAAC TAGCATAACC CCTTGGGGCC 
840 

TCTAAACGGG TCTTGAGGGG TTTTTTGCTG AAAGGAGGAA CTATATCCGG ATATCCCGGG 
900 

GTGGGCGAAG AACTCCAGCA TGAGATCCCC GCGCTGGAGG ATCATCCAGC CGGCGTCCCG 
960 

GAAAACGATT CCGAAGCCCA ACCTTTCATA GAAGGCGGCG GTGGAATCGA AATCTCGTGA 
1020 

TGGCAGGTTG GGCGTCGCTT GGTCGGTCAT TTCGAACCCC AGAGTCCCGC TCAGAAGAAC 
1080 




TCGTCAAGAA GGCGATAGAA GGCGATGCGC TGCGAATCGG GAGCGGCGAT ACCGTAAAGC
1140

ACGAGGAAGC GGTCAGCCCA TTCGCCGCCA AGCTCTTCAG CAATATCACG GGTAGCCAAC
1200

GCTATGTCCT GATAGCGGTC CGCCACACCC AGCCGGCCAC AGTCGATGAA TCCAGAAAAG
1260

CGGCCATTTT CCACCATGAT ATTCGGCAAG CAGGCATCGC CATGGGTCAC GACGAGATCC
1320

TCGCCGTCGG GCATGCGCGC CTTGAGCCTG GCGAACAGTT CGGCTGGCGC GAGCCCCTGA
1380

TGCTCTTCGT CCAGATCATC CTGATCGACA AGACCGGCTT CCATCCGAGT ACGTGCTCGC
1440

TCGATGCGAT GTTTCGCTTG GTGGTCGAAT GGGCAGGTAG CCGGATCAAG CGTATGCAGC
1500

CGCCGCATTG CATCAGCCAT GATGGATACT TTCTCGGCAG GAGCAAGGTG AGATGACAGG
1560

AGATCCTGCC CCGGCACTTC GCCCAATAGC AGCCAGTCCC TTCCCGCTTC AGTGACAACG
1620

TCGAGCACAG CTGCGCAAGG AACGCCCGTC GTGGCCAGCC ACGATAGCCG CGCTGCCTCG
1680

TCCTGCAGTT CATTCAGGGC ACCGGACAGG TCGGTCTTGA CAAAAAGAAC CGGGCGCCCC
1740

TGCGCTGACA GCCGGAACAC GGCGGCATCA GAGCAGCCGA TTGTCTGTTG TGCCCAGTCA
1800

TAGCCGAATA GCCTCTCCAC CCAAGCGGCC GGAGAACCTG CGTGCAATCC ATCTTGTTCA
1860

ATCATGCGAA ACGATCCTCA TCCTGTCTCT TGATCAGATC TTGATCCCCT GCGCCATCAG
1920

ATCCTTGGCG GCAAGAAAGC CATCCAGTTT ACTTTGCAGG GCTTCCCAAC CTTACCAGAG
1980

GGCGCCCCAG CTGGCAATTC CGGTTCGCTT GCTGTCCATA AAACCGCCCA GTCTAGCTAT
2040

CGCCATGTAA GCCCACTGCA AGCTACCTGC TTTCTCTTTG CGCTTGCGTT TTCCCTTGTC
2100



CAGATAGCCC AGTAGCTGAC ATTCATCCGG GGTCAGCACC GTTTCTGCGG ACTGGCTTTC 
2160 

TACGTGTTCC GCTTCCTTTA GCAGCCCTTG CGCCCTGAGT GCTTGCGGCA GCGTGAAGCT 
2220 




FIGURE 26B 

SEQUENCE DESCRIPTION: SEQ ID NO:10:

ATG TCC CCT ATA CTA GGT TAT TGG AAA ATT AAG GGC CTT GTG CAA CCC
48
Met Ser Pro Ile Leu Gly Tyr Trp Lys Ile Lys Gly Leu Val Gln Pro
1 5 10 15

ACT CGA CTT CTT TTG GAA TAT CTT GAA GAA AAA TAT GAA GAG CAT TTG
96
Thr Arg Leu Leu Leu Glu Tyr Leu Glu Glu Lys Tyr Glu Glu His Leu
20 25 30

TAT GAG CGC GAT GAA GGT GAT AAA TGG CGA AAC AAA AAG TTT GAA TTG
144
Tyr Glu Arg Asp Glu Gly Asp Lys Trp Arg Asn Lys Lys Phe Glu Leu
35 40 45

GGT TTG GAG TTT CCC AAT CTT CCT TAT TAT ATT GAT GGT GAT GTT AAA
192
Gly Leu Glu Phe Pro Asn Leu Pro Tyr Tyr Ile Asp Gly Asp Val Lys
50 55 60

TTA ACA CAG TCT ATG GCC ATC ATA CGT TAT ATA GCT GAC AAG CAC AAC
240
Leu Thr Gln Ser Met Ala Ile Ile Arg Tyr Ile Ala Asp Lys His Asn
65 70 75 80

ATG TTG GGT GGT TGT CCA AAA GAG CGT GCA GAG ATT TCA ATG CTT GAA
288
Met Leu Gly Gly Cys Pro Lys Glu Arg Ala Glu Ile Ser Met Leu Glu
85 90 95

GGA GCG GTT TTG GAT ATT AGA TAC GGT GTT TCG AGA ATT GCA TAT AGT
336
Gly Ala Val Leu Asp Ile Arg Tyr Gly Val Ser Arg Ile Ala Tyr Ser
100 105 110

AAA GAC TTT GAA ACT CTC AAA GTT GAT TTT CTT AGC AAG CTA CCT GAA
384
Lys Asp Phe Glu Thr Leu Lys Val Asp Phe Leu Ser Lys Leu Pro Glu
115 120 125

ATG CTG AAA ATG TTC GAA GAT CGT TTA TGT CAT AAA ACA TAT TTA AAT
432
Met Leu Lys Met Phe Glu Asp Arg Leu Cys His Lys Thr Tyr Leu Asn
130 135 140

GGT GAT CAT GTA ACC CAT CCT GAC TTC ATG TTG TAT GAC GCT CTT GAT
480
Gly Asp His Val Thr His Pro Asp Phe Met Leu Tyr Asp Ala Leu Asp
145 150 155 160

GTT GTT TTA TAC ATG GAC CCA ATG TGC CTG GAT GCG TTC CCA AAA TTA
528
Val Val Leu Tyr Met Asp Pro Met Cys Leu Asp Ala Phe Pro Lys Leu
165 170 175

GTT TGT TTT AAA AAA CGT ATT GAA GCT ATC CCA CAA ATT GAT AAG TAC
576
Val Cys Phe Lys Lys Arg Ile Glu Ala Ile Pro Gln Ile Asp Lys Tyr
180 185 190

TTG AAA TCC AGC AAG TAT ATA GCA TGG CCT TTG CAG GGC TGG CAA GCC
624
Leu Lys Ser Ser Lys Tyr Ile Ala Trp Pro Leu Gln Gly Trp Gln Ala
195 200 205

ACG TTT GGT GGT GGC GAC CAT CCT CCA AAA TCG GAT CTG GTT CCG CGT
672
Thr Phe Gly Gly Gly Asp His Pro Pro Lys Ser Asp Leu Val Pro Arg
210 215 220

GGA TCT CGT CGT GCA TCT GTT GGA TCG CAT ATG CCC ATG GCC AAT TTA
720
Gly Ser Arg Arg Ala Ser Val Gly Ser His Met Pro Met Ala Asn Leu
225 230 235 240

CTG ACC GTA CAC CAA AAT TTG CCT GCA TTA CCG GTC GAT GCA ACG AGT
768
Leu Thr Val His Gln Asn Leu Pro Ala Leu Pro Val Asp Ala Thr Ser
245 250 255

GAT GAG GTT CGC AAG AAC CTG ATG GAC ATG TTC AGG GAT CGC CAG GCG
816
Asp Glu Val Arg Lys Asn Leu Met Asp Met Phe Arg Asp Arg Gln Ala
260 265 270

TTT TCT GAG CAT ACC TGG AAA ATG CTT CTG TCC GTT TGC CGG TCG TGG
864
Phe Ser Glu His Thr Trp Lys Met Leu Leu Ser Val Cys Arg Ser Trp
275 280 285

GCG GCA TGG TGC AAG TTG AAT AAC CGG AAA TGG TTT CCC GCA GAA CCT
912
Ala Ala Trp Cys Lys Leu Asn Asn Arg Lys Trp Phe Pro Ala Glu Pro
290 295 300

GAA GAT GTT CGC GAT TAT CTT CTA TAT CTT CAG GCG CGC GGT CTG GCA
960
Glu Asp Val Arg Asp Tyr Leu Leu Tyr Leu Gln Ala Arg Gly Leu Ala
305 310 315 320

GTA AAA ACT ATC CAG CAA CAT TTG GGC CAG CTA AAC ATG CTT CAT CGT
1008
Val Lys Thr Ile Gln Gln His Leu Gly Gln Leu Asn Met Leu His Arg
325 330 335

CGG TCC GGG CTG CCA CGA CCA AGT GAC AGC AAT GCT GTT TCA CTG GTT
1056
Arg Ser Gly Leu Pro Arg Pro Ser Asp Ser Asn Ala Val Ser Leu Val
340 345 350

ATG CGG CGG ATC CGA AAA GAA AAC GTT GAT GCC GGT GAA CGT GCA AAA
1104
Met Arg Arg Ile Arg Lys Glu Asn Val Asp Ala Gly Glu Arg Ala Lys
355 360 365

CAG GCT CTA GCG TTC GAA CGC ACT GAT TTC GAC CAG GTT CGT TCA CTC
1152
Gln Ala Leu Ala Phe Glu Arg Thr Asp Phe Asp Gln Val Arg Ser Leu
370 375 380

ATG GAA AAT AGC GAT CGC TGC CAG GAT ATA CGT AAT CTG GCA TTT CTG
1200
Met Glu Asn Ser Asp Arg Cys Gln Asp Ile Arg Asn Leu Ala Phe Leu
385 390 395 400

GGG ATT GCT TAT AAC ACC CTG TTA CGT ATA GCC GAA ATT GCC AGG ATC
1248
Gly Ile Ala Tyr Asn Thr Leu Leu Arg Ile Ala Glu Ile Ala Arg Ile
405 410 415

AGG GTT AAA GAT ATC TCA CGT ACT GAC GGT GGG AGA ATG TTA ATC CAT
1296
Arg Val Lys Asp Ile Ser Arg Thr Asp Gly Gly Arg Met Leu Ile His
420 425 430

ATT GGC AGA ACG AAA ACG CTG GTT AGC ACC GCA GGT GTA GAG AAG GCA
1344
Ile Gly Arg Thr Lys Thr Leu Val Ser Thr Ala Gly Val Glu Lys Ala
435 440 445

CTT AGC CTG GGG GTA ACT AAA CTG GTC GAG CGA TGG ATT TCC GTC TCT
1392
Leu Ser Leu Gly Val Thr Lys Leu Val Glu Arg Trp Ile Ser Val Ser
450 455 460

GGT GTA GCT GAT GAT CCG AAT AAC TAC CTG TTT TGC CGG GTC AGA AAA
1440
Gly Val Ala Asp Asp Pro Asn Asn Tyr Leu Phe Cys Arg Val Arg Lys
465 470 475 480

AAT GGT GTT GCC GCG CCA TCT GCC ACC AGC CAG CTA TCA ACT CGC GCC
1488
Asn Gly Val Ala Ala Pro Ser Ala Thr Ser Gln Leu Ser Thr Arg Ala
485 490 495

CTG GAA GGG ATT TTT GAA GCA ACT CAT CGA TTG ATT TAC GGC GCT AAG
1536
Leu Glu Gly Ile Phe Glu Ala Thr His Arg Leu Ile Tyr Gly Ala Lys
500 505 510

GAT GAC TCT GGT CAG AGA TAC CTG GCC TGG TCT GGA CAC AGT GCC CGT
1584
Asp Asp Ser Gly Gln Arg Tyr Leu Ala Trp Ser Gly His Ser Ala Arg
515 520 525

GTC GGA GCC GCG CGA GAT ATG GCC CGC GCT GGA GTT TCA ATA CCG GAG
1632
Val Gly Ala Ala Arg Asp Met Ala Arg Ala Gly Val Ser Ile Pro Glu
530 535 540

ATC ATG CAA GCT GGT GGC TGG ACC AAT GTA AAT ATT GTC ATG AAC TAT
1680
Ile Met Gln Ala Gly Gly Trp Thr Asn Val Asn Ile Val Met Asn Tyr
545 550 555 560

ATC CGT AAC CTG GAT AGT GAA ACA GGG GCA ATG GTG CGC CTG CTG GAA
1728
Ile Arg Asn Leu Asp Ser Glu Thr Gly Ala Met Val Arg Leu Leu Glu
565 570 575

GAT GGC GAT TAG
1740
Asp Gly Asp

SEQUENCE DESCRIPTION: SEQ ID NO:11:

Met Ser Pro Ile Leu Gly Tyr Trp Lys Ile Lys Gly Leu Val Gln Pro
1 5 10 15

Thr Arg Leu Leu Leu Glu Tyr Leu Glu Glu Lys Tyr Glu Glu His Leu
20 25 30

Tyr Glu Arg Asp Glu Gly Asp Lys Trp Arg Asn Lys Lys Phe Glu Leu
35 40 45

Gly Leu Glu Phe Pro Asn Leu Pro Tyr Tyr Ile Asp Gly Asp Val Lys
50 55 60

Leu Thr Gln Ser Met Ala Ile Ile Arg Tyr Ile Ala Asp Lys His Asn
65 70 75 80

Met Leu Gly Gly Cys Pro Lys Glu Arg Ala Glu Ile Ser Met Leu Glu
85 90 95

Gly Ala Val Leu Asp Ile Arg Tyr Gly Val Ser Arg Ile Ala Tyr Ser
100 105 110

Lys Asp Phe Glu Thr Leu Lys Val Asp Phe Leu Ser Lys Leu Pro Glu
115 120 125

Met Leu Lys Met Phe Glu Asp Arg Leu Cys His Lys Thr Tyr Leu Asn
130 135 140

Gly Asp His Val Thr His Pro Asp Phe Met Leu Tyr Asp Ala Leu Asp
145 150 155 160

Val Val Leu Tyr Met Asp Pro Met Cys Leu Asp Ala Phe Pro Lys Leu
165 170 175

Val Cys Phe Lys Lys Arg Ile Glu Ala Ile Pro Gln Ile Asp Lys Tyr
180 185 190

Leu Lys Ser Ser Lys Tyr Ile Ala Trp Pro Leu Gln Gly Trp Gln Ala
195 200 205

Thr Phe Gly Gly Gly Asp His Pro Pro Lys Ser Asp Leu Val Pro Arg
210 215 220

Gly Ser Arg Arg Ala Ser Val Gly Ser His Met Pro Met Ala Asn Leu
225 230 235 240

Leu Thr Val His Gln Asn Leu Pro Ala Leu Pro Val Asp Ala Thr Ser
245 250 255

Asp Glu Val Arg Lys Asn Leu Met Asp Met Phe Arg Asp Arg Gln Ala
260 265 270

Phe Ser Glu His Thr Trp Lys Met Leu Leu Ser Val Cys Arg Ser Trp
275 280 285

Ala Ala Trp Cys Lys Leu Asn Asn Arg Lys Trp Phe Pro Ala Glu Pro
290 295 300

Glu Asp Val Arg Asp Tyr Leu Leu Tyr Leu Gln Ala Arg Gly Leu Ala
305 310 315 320

Val Lys Thr Ile Gln Gln His Leu Gly Gln Leu Asn Met Leu His Arg
325 330 335

Arg Ser Gly Leu Pro Arg Pro Ser Asp Ser Asn Ala Val Ser Leu Val
340 345 350

Met Arg Arg Ile Arg Lys Glu Asn Val Asp Ala Gly Glu Arg Ala Lys
355 360 365

Gln Ala Leu Ala Phe Glu Arg Thr Asp Phe Asp Gln Val Arg Ser Leu
370 375 380

Met Glu Asn Ser Asp Arg Cys Gln Asp Ile Arg Asn Leu Ala Phe Leu
385 390 395 400

Gly Ile Ala Tyr Asn Thr Leu Leu Arg Ile Ala Glu Ile Ala Arg Ile
405 410 415

Arg Val Lys Asp Ile Ser Arg Thr Asp Gly Gly Arg Met Leu Ile His
420 425 430

Ile Gly Arg Thr Lys Thr Leu Val Ser Thr Ala Gly Val Glu Lys Ala
435 440 445

Leu Ser Leu Gly Val Thr Lys Leu Val Glu Arg Trp Ile Ser Val Ser
450 455 460

Gly Val Ala Asp Asp Pro Asn Asn Tyr Leu Phe Cys Arg Val Arg Lys
465 470 475 480

Asn Gly Val Ala Ala Pro Ser Ala Thr Ser Gln Leu Ser Thr Arg Ala
485 490 495

Leu Glu Gly Ile Phe Glu Ala Thr His Arg Leu Ile Tyr Gly Ala Lys
500 505 510

Asp Asp Ser Gly Gln Arg Tyr Leu Ala Trp Ser Gly His Ser Ala Arg
515 520 525

Val Gly Ala Ala Arg Asp Met Ala Arg Ala Gly Val Ser Ile Pro Glu
530 535 540

Ile Met Gln Ala Gly Gly Trp Thr Asn Val Asn Ile Val Met Asn Tyr
545 550 555 560

Ile Arg Asn Leu Asp Ser Glu Thr Gly Ala Met Val Arg Leu Leu Glu
565 570 575

Asp Gly Asp